More Collected Issues with Atomics

ISO/IEC JTC1 SC22 WG21 N2992 = 09-0182 - 2009-10-23

Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org

This paper revises N2925 = 09-0115 - 2009-08-02.

Introduction

This paper presents several topics. Each topic presents the related issues, then presents formal wording changes to various sections in the working draft. Proposed resolutions in the issues are not to be applied. Only the wording in paper sections with working draft section titles shall be applied.

Headers

Issues

LWG 1145, UK 312: inappropriate headers for atomics

Status: New

Submitter: LWG

Discussion: The contents of the <stdatomic.h> header are not listed anywhere, and <cstdatomic> is listed as a C99 header in chapter 17. If we intend to use these for compatibility with a future C standard, we should not use them now.

Proposed resolution: Remove <cstdatomic> from the C99 headers in table 14. Add a new header <atomic> to the headers in table 13. Update chapter 29 to remove reference to <stdatomic.h> and replace the use of <cstdatomic> with <atomic>.

[ If and when WG14 adds atomic operations to C we can add corresponding headers to table 14 with a TR. ]

Committee: May be resolvable with a footnote for clarity stating that the header is defined where it exists.

Wording

17.6.1.2 Headers [headers]

Edit table 13 as follows.

Table 13 — C++ library headers
`<algorithm>`	`<forward_list>`	`<iterator_concepts>`	`<queue>`	`<system_error>`
`<array>`	`<fstream>`	`<limits>`	`<random>`	`<threads>`
`<bitset>`	`<functional>`	`<list>`	`<ratio>`	`<tuple>`
`<chrono>`	`<future>`	`<locale>`	`<regex>`	`<typeinfo>`
`<codecvt>`	`<initializer_list>`	`<map>`	`<set>`	`<type_traits>`
`<complex>`	`<iomanip>`	`<memory>`	`<sstream>`	`<unordered_map>`
`<concepts>`	`<ios>`	`<memory_concepts>`	`<stack>`	`<unordered_set>`
`<condition_variable>`	`<iosfwd>`	`<mutex>`	`<stdexcept>`	`<utility>`
`<container_concepts>`	`<iostream>`	`<new>`	`<streambuf>`	`<valarray>`
`<deque>`	`<istream>`	`<numeric>`	`<string>`	`<vector>`
`<exception>`	`<iterator>`	`<ostream>`	`<strstream>`	`<atomic>`

Edit table 14 as follows.

Table 14 — C++ headers for C library facilities
`<cassert>`	`<cinttypes>`	`<csignal>`	`<cstdio>`	`<cwchar>`
`<ccomplex>`	`<ciso646>`	`<cstdarg>`	`<cstdlib>`	`<cwctype>`
`<cctype>`	`<climits>`	~~`<cstdatomic>`~~	`<cstring>`
`<cerrno>`	`<clocale>`	`<cstdbool>`	`<ctgmath>`
`<cfenv>`	`<cmath>`	`<cstddef>`	`<ctime>`
`<cfloat>`	`<csetjmp>`	`<cstdint>`	`<cuchar>`

29.1 General [atomics.general]

Edit table 118 as follows.

Table 118 — Atomics library summary
Subclause	Header(s)
29.3 Order and Consistency
29.4 Lock-free Property
29.5 Atomic Types	~~`<cstdatomic>`, `<stdatomic.h>`~~ `<atomic>`
29.6 Operations on Atomic Types
29.7 Flag Type and Operations

29.2 Header `<cstdatomic>` synopsis [atomics.syn]

Edit the section title as follows.

29.2 Header ~~<cstdatomic>~~ <atomic> synopsis [atomics.syn]

Conceptification

While the July 2009 meeting of the committee voted to remove concepts from the C++0x effort, the definition of the atomic template must still be clear in its documentation about the concepts it uses.

Issues

UK 311: Conceptification

Comment: Atomic types cannot be used generically in a constrained template.

Suggestion: Provide constraints for the atomics library, clause 29.

US 87, LWG 1143: Atomic operations library not concept enabled

Status: New

Submitter: LWG

Discussion: The atomics chapter is not concept enabled.

Needs to also consider issues LWG 923 and LWG 924.

LWG 923: atomics with floating-point

Status: Open

Submitter: Herb Sutter

Discussion: Right now, C++0x doesn't have atomic<float>. We're thinking of adding the words to support it for TR2 (note: that would be slightly post-C++0x). If we need it, we could probably add the words.

Proposed resolutions: Using atomic<FP>::compare_exchange (weak or strong) should be either:

1. ill-formed, or
2. well-defined.

I propose Option 1 for C++0x for expediency. If someone wants to argue for Option 2, they need to say what exactly they want compare_exchange to mean in this case (IIRC, C++0x doesn't even assume IEEE 754).

[ Summit: ]

Move to open. Blocked until concepts for atomics are addressed.

[ Post Summit Anthony adds: ]

Recommend NAD. C++0x does have std::atomic<float>, and both compare_exchange_weak and compare_exchange_strong are well-defined in this case. Maybe change the note in 29.6 [atomics.types.operations] paragraph 20 to:
[Note: The effect of the compare-and-exchange operations is
if (!memcmp(object,expected,sizeof(*object)))
    *object = desired;
else
    *expected = *object;
This may result in failed comparisons for values that compare equal if the underlying type has padding bits or alternate representations of the same value. —end note]

Proposed resolution: Change the note in 29.6 [atomics.types.operations] paragraph 20 to:

[Note: The effect of the compare-and-exchange operations is
if (~~*object == *expected~~ !memcmp(object,expected,sizeof(*object)))
    *object = desired;
else
    *expected = *object;
This may result in failed comparisons for values that compare equal if the underlying type has padding bits or alternate representations of the same value. —end note]

LWG 924: structs with internal padding

Status: Open

Submitter: Herb Sutter

Discussion: Right now, the compare_exchange_weak loop should rapidly converge on the padding contents. But compare_exchange_strong will require a bit more compiler work to ignore padding for comparison purposes.

Note that this isn't a problem for structs with no padding, and we do already have one portable way to ensure that there is no padding that covers the key use cases: Have elements be the same type. I suspect that the greatest need is for a structure of two pointers, which has no padding problem. I suspect the second need is a structure of a pointer and some form of an integer. If that integer is intptr_t, there will be no padding.
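The layout claim above can be checked mechanically. The following is a minimal sketch, assuming a typical implementation in which intptr_t has the size and alignment of a pointer; the struct names and the helper padding_bytes are hypothetical and appear nowhere in the working draft.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical payload types; the names are illustrative only.
struct two_pointers    { void* head; void* tail; };
struct pointer_and_tag { void* ptr;  std::intptr_t tag; };

// Number of bytes in T not accounted for by its members, given the
// total size of those members. Zero means T has no padding.
template <class T>
std::size_t padding_bytes(std::size_t member_bytes) {
    assert(member_bytes <= sizeof(T));  // members cannot exceed the object
    return sizeof(T) - member_bytes;
}
```

A nonzero result for some other struct would indicate padding bytes that a memcmp-based comparison would also examine.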

Related but separable issue: For unused bitfields, or other unused fields for that matter, we should probably say it's the programmer's responsibility to set them to zero or otherwise ensure they'll be ignored by memcmp.

Proposed resolutions: Using atomic<struct-with-padding>::compare_exchange_strong should be either:

1. ill-formed, or
2. well-defined.

I propose Option 1 for C++0x for expediency, though I'm not sure how to say it. I would be happy with Option 2, which I believe would mean that compare_exchange_strong would be implemented to avoid comparing padding bytes, or something equivalent such as always zeroing out padding when loading/storing/comparing. (Either implementation might require compiler support.)

[ Summit: ]

Move to open. Blocked until concepts for atomics are addressed.

[ Post Summit Anthony adds: ]

The resolution of LWG 923 should resolve this issue as well.

Proposed resolution:

Wording

29.5.3 Generic Types [atomic.types.generic]

Edit paragraph 1 as follows.

There is a generic class template atomic<T>. The type of the template argument T shall be ~~trivially copy assignable and bitwise equality comparable~~ trivially copyable (3.9 [basic.types]). [Note: Type arguments that are not also statically initializable ~~and trivially destructable~~ may be difficult to use. —end note]
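As an illustration of the revised requirement, the sketch below checks a user-defined type for trivial copyability before instantiating the template. The type point and the function round_trip are hypothetical, and the example assumes a library that provides std::is_trivially_copyable.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical payload type: trivially copyable, so atomic<point> is allowed.
struct point { std::int16_t x; std::int16_t y; };

static_assert(std::is_trivially_copyable<point>::value,
              "atomic<T> requires a trivially copyable T");

// Stores a point into an atomic<point> and reads it back.
inline point round_trip(point p) {
    std::atomic<point> cell(point{0, 0});
    cell.store(p);
    point q = cell.load();
    assert(q.x == p.x && q.y == p.y);  // the loaded copy matches the stored one
    return q;
}
```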

29.6 Operations on Atomic Types [atomics.types.operations]

Edit paragraph 18 as follows.

Effects: Atomically, compares the ~~value~~ contents of the memory pointed to by object or by this for equality with that in expected, and if true, replaces the ~~value~~ contents of the memory pointed to by object or by this with that in desired, and if false, updates the ~~value~~ contents of the memory in expected with the ~~value~~ contents of the memory pointed to by object or by this. Further, if the comparison is true, memory is affected according to the value of success, and if the comparison is false, memory is affected according to the value of failure. When only one memory_order argument is supplied, the value of success is order, and the value of failure is order except that a value of memory_order_acq_rel shall be replaced by the value memory_order_acquire and a value of memory_order_release shall be replaced by the value memory_order_relaxed. These operations are atomic read-modify-write operations (1.10).

Edit paragraph 20 as follows.

[Note: The effect of the compare-and-exchange operations is
if (~~*object == *expected~~ memcmp(object,expected,sizeof(*object))==0)
    ~~*object = desired~~ memcpy(object,&desired,sizeof(*object));
else
    ~~*expected = *object~~ memcpy(expected,object,sizeof(*object));
—end note] [Example: The expected use of the compare-and-exchange operations is as follows. The compare-and-exchange operations will update expected when another iteration of the loop is needed.
expected = current.load();
do { desired = function(expected);
} while (!current.compare_exchange_weak(expected, desired));
—end example]
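The note and the example above can be sketched together as follows. The function model_compare_exchange is a hypothetical, non-atomic reference model of the memcmp and memcpy semantics, intended only to show the data flow; atomic_double is the loop from the example with an assumed update function (doubling) substituted for function.

```cpp
#include <atomic>
#include <cassert>
#include <cstring>

// Non-atomic reference model of the note, for illustration only:
// it compares and copies object representations, as the note describes.
template <class T>
bool model_compare_exchange(T* object, T* expected, T desired) {
    assert(object && expected);  // both pointers must be valid
    if (std::memcmp(object, expected, sizeof(*object)) == 0) {
        std::memcpy(object, &desired, sizeof(*object));
        return true;
    }
    std::memcpy(expected, object, sizeof(*object));
    return false;
}

// The loop from the example; a failed exchange refreshes expected,
// so the update is recomputed from the current value.
inline int atomic_double(std::atomic<int>& current) {
    int expected = current.load();
    int desired;
    do {
        desired = expected * 2;
    } while (!current.compare_exchange_weak(expected, desired));
    return desired;
}
```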

Edit paragraph 21 as follows.

Remark: The weak compare-and-exchange operations may fail spuriously, that is, return false while leaving the ~~value~~ contents of memory pointed to by ~~expected unchanged~~ expected before the operation is the same as that of the object and the same as that of expected after the operation. [Note: This spurious failure enables implementation of compare-and-exchange on a broader class of machines, e.g., load-locked store-conditional machines. —end note] [Example: A consequence of spurious failure is that nearly all uses of weak compare-and-exchange will be in a loop.
expected = current.load();
do desired = function(expected);
while (!current.compare_exchange(expected, desired));
When a compare-and-exchange is in a loop, the weak version will yield better performance on some platforms. When a weak compare-and-exchange would require a loop and a strong one would not, the strong one is preferable. ~~—end example]~~ —end note]

Insert a new paragraph after paragraph 21 as follows.

[Note: The memcpy and memcmp semantics of the compare-and-exchange operations may result in failed comparisons for values that compare equal with operator== if the underlying type has padding bits, has trap values, or has alternate representations of the same value. A consequence is that compare_exchange_strong should be used with extreme care. On the other hand, compare_exchange_weak should converge rapidly. —end note]
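One case of alternate representations can be sketched with floating-point zero. The example assumes IEEE 754 floating point, where +0.0 and -0.0 compare equal but have different object representations; the function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstring>

// True when a and b compare equal with operator== yet differ bytewise.
inline bool equal_but_different_bytes(float a, float b) {
    return a == b && std::memcmp(&a, &b, sizeof(float)) != 0;
}

// Tries to replace a stored +0.0f using -0.0f as the expected value.
// Returns whether the strong compare-and-exchange succeeded.
inline bool cas_with_negative_zero() {
    std::atomic<float> object(0.0f);
    float expected = -0.0f;
    bool ok = object.compare_exchange_strong(expected, 1.0f);
    // On failure, expected now holds the stored representation (+0.0f).
    assert(ok || !equal_but_different_bytes(expected, 0.0f));
    return ok;
}
```

Under the stated assumption the exchange fails on the first attempt even though the two values compare equal, and a second attempt with the refreshed expected value would succeed.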

Fences

Issues

LWG 926: Sequentially consistent fences, relaxed operations and modification order

Status: Open

Submitter: Anthony Williams

Discussion: There was an interesting issue raised over on comp.programming.threads today regarding the following example


// Thread 1:
x.store(1, memory_order_relaxed);           // SX
atomic_thread_fence(memory_order_seq_cst);  // F1
y.store(1, memory_order_relaxed);           // SY1
atomic_thread_fence(memory_order_seq_cst);  // F2
r1 = y.load(memory_order_relaxed);          // RY

// Thread 2:
y.store(0, memory_order_relaxed);          // SY2
atomic_thread_fence(memory_order_seq_cst); // F3
r2 = x.load(memory_order_relaxed);         // RX

is the outcome r1 == 0 and r2 == 0 possible?

I think the intent is that this is not possible, but I am not sure the wording guarantees that. Here is my analysis:

Since all the fences are SC, there must be a total order between them. F1 must be before F2 in that order since they are in the same thread. Therefore F3 is either before F1, between F1 and F2 or after F2.

If F3 is after F2, then we can apply 29.3 [atomics.order]p5 from N2798:

For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B observes either the effects of A or a later modification of M in its modification order.

In this case, A is SX, B is RX, the fence X is F2 and the fence Y is F3, so RX must see 1.

If F3 is before F2, this doesn't apply, but F3 can therefore be before or after F1.

If F3 is after F1, the same logic applies, but this time the fence X is F1. Therefore again, RX must see 1.

Finally we have the case that F3 is before F1 in the SC ordering. There are now no guarantees about RX, and RX can see 0 (r2==0).

We can apply 29.3 [atomics.order]p5 again. This time, A is SY2, B is RY, X is F3 and Y is F1. Thus RY must observe the effects of SY2 or a later modification of y in its modification order.

Since SY1 is sequenced before RY, RY must observe the effects of SY1 or a later modification of y in its modification order.

To ensure that RY sees 1 (r1==1), we must show that SY1 is later in the modification order of y than SY2.

We're now skating on thin ice. Conceptually, SY2 happens-before F3, F3 is SC-ordered before F1, F1 happens-before SY1, so SY1 is later in the modification order M of y, and RY must see the result of SY1 (r1==1). However, I don't think the words are clear on that.
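The example under discussion can be written as a small test harness. This is a sketch only: the function names are hypothetical, and the initial values of x and y are assumed to be zero. Failing to observe the outcome on one machine does not show that the wording forbids it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the two-thread example once and reports whether the questioned
// outcome (r1 == 0 and r2 == 0) was observed.
inline bool run_once_sees_both_zero() {
    std::atomic<int> x(0), y(0);
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);                // SX
        std::atomic_thread_fence(std::memory_order_seq_cst);  // F1
        y.store(1, std::memory_order_relaxed);                // SY1
        std::atomic_thread_fence(std::memory_order_seq_cst);  // F2
        r1 = y.load(std::memory_order_relaxed);               // RY
    });
    std::thread t2([&] {
        y.store(0, std::memory_order_relaxed);                // SY2
        std::atomic_thread_fence(std::memory_order_seq_cst);  // F3
        r2 = x.load(std::memory_order_relaxed);               // RX
    });
    t1.join();  // joining makes r1 and r2 safe to read below
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the experiment and counts how often the outcome appeared.
inline int count_both_zero(int runs) {
    assert(runs >= 0);
    int hits = 0;
    for (int i = 0; i < runs; ++i)
        if (run_once_sees_both_zero()) ++hits;
    return hits;
}
```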

[ Post Summit Hans adds: ]

In my (Hans') view, our definition of fences will always be weaker than what particular hardware will guarantee. Memory_order_seq_cst fences inherently don't guarantee sequential consistency anyway, for good reasons (e.g. because they can't enforce a total order on stores). Hence I don't think the issue demonstrates a gross failure to achieve what we intended to achieve. The example in question is a bit esoteric. Hence, in my view, living with the status quo certainly wouldn't be a disaster either.

In any case, we should probably add text along the lines of the following between p5 and p6 in 29.3 [atomics.order]:

[Note: Memory_order_seq_cst only ensures sequential consistency for a data-race-free program that uses exclusively memory_order_seq_cst operations. Any use of weaker ordering will invalidate this guarantee unless extreme care is used. In particular, memory_order_seq_cst fences only ensure a total order for the fences themselves. They cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering specifications. —end note]

Also see thread beginning at c++std-lib-23271.

[ Herve's correction: ]

Minor point, and sorry for the knee jerk reaction: I admit to having no knowledge of Memory_order_seq_cst, but my former boss (John Lakos) has ingrained an automatic introspection on the use of "only". I think you meant:

[Note: Memory_order_seq_cst ensures sequential consistency only for . . . . In particular, memory_order_seq_cst fences ensure a total order only for . . .

Unless, of course, Memory_order_seq_cst really does nothing but ensure sequential consistency for a data-race-free program that uses exclusively memory_order_seq_cst operations.

Proposed resolution: Add a new paragraph after 29.3 [atomics.order]p5 that says

For atomic operations A and B on an atomic object M, where A and B modify M, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B occurs later than A in the modification order of M.

UK 313: seq_cst Fences

Comment: seq_cst fences don't necessarily guarantee ordering.

Suggestion: Add a new paragraph after 29.3 [atomics.order] p5 that says "For atomic operations A and B on an atomic object M, where A and B modify M, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B occurs later than A in the modification order of M."

Disposition:

Committee:

Wording

29.3 Order and Consistency [atomics.order]

Add a new paragraph after paragraph 5 as follows.

For atomic operations A and B on an atomic object M, where A and B modify M, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B occurs later than A in the modification order of M.

Add a new paragraph after the above paragraph as follows.

[Note: Memory_order_seq_cst ensures sequential consistency only for a data-race-free program that uses exclusively memory_order_seq_cst operations. Any use of weaker ordering will invalidate this guarantee unless extreme care is used. In particular, memory_order_seq_cst fences ensure a total order only for the fences themselves. They cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering specifications. —end note]

Lock Free

Issues

LWG 1146, US 88: "lockfree" does not say enough

Status: New

Submitter: Jeffrey Yasskin

Discussion: The "lockfree" facilities do not tell the programmer enough.

There are 2 problems here. First, at least on x86, it's less important to me whether some integral types are lock free than what is the largest type I can pass to atomic and have it be lock-free. For example, if long longs are not lock-free, ATOMIC_INTEGRAL_LOCK_FREE is probably 1, but I'd still be interested in knowing whether longs are always lock-free. Or if long longs at any address are lock-free, I'd expect ATOMIC_INTEGRAL_LOCK_FREE to be 2, but I may actually care whether I have access to the cmpxchg16b instruction. None of the support here helps with that question. (There are really 2 related questions here: what alignment requirements are there for lock-free access; and what processor is the program actually running on, as opposed to what it was compiled for?)

Second, having atomic_is_lock_free only apply to individual objects is pretty useless (except, as Lawrence Crowl points out, for throwing an exception when an object is unexpectedly not lock-free). I'm likely to want to use its result to decide what algorithm to use, and that algorithm is probably going to allocate new memory containing atomic objects and then try to act on them. If I can't predict the lock-freedom of the new object by checking the lock-freedom of an existing object, I may discover after starting the algorithm that I can't continue.

[ 2009-06-16 Jeffrey Yasskin adds: ]

To solve the first problem, I think 2 macros would help: MAX_POSSIBLE_LOCK_FREE_SIZE and MAX_GUARANTEED_LOCK_FREE_SIZE, which expand to the maximum value of sizeof(T) for which atomic<T> may (or will, respectively) use lock-free operations. Lawrence points out that this "relies heavily on implementations using word-size compare-swap on sub-word-size types, which in turn requires address modulation." He expects that to be the end state anyway, so it doesn't bother him much.

To solve the second, I think one could specify that equally aligned objects of the same type will return the same value from atomic_is_lock_free(). I don't know how to specify "equal alignment". Lawrence suggests an additional function, atomic_is_always_lock_free().

Committee:

Will expand ATOMIC_INTEGRAL_LOCK_FREE to add a macro for each integer size — one for char/schar/uchar, one for short/ushort, etc.
Will require that atomic types are aligned to necessary alignment to make them lock-free if possible
Will make is_lock_free apply to all instances of a given type, allowing a null pointer to be passed to the namespace-level function
Relatedly, Mike Spertus will create an issue to propose adding a traits mechanism to check the compile-time properties through a template mechanism rather than macros

Wording

29.4 Lock-free Property [atomics.lockfree]

Edit the synopsis as follows.


namespace std {
~~#define ATOMIC_INTEGRAL_LOCK_FREE unspecified~~
#define ATOMIC_CHAR_LOCK_FREE implementation-defined
#define ATOMIC_CHAR16_T_LOCK_FREE implementation-defined
#define ATOMIC_CHAR32_T_LOCK_FREE implementation-defined
#define ATOMIC_WCHAR_T_LOCK_FREE implementation-defined
#define ATOMIC_SHORT_LOCK_FREE implementation-defined
#define ATOMIC_INT_LOCK_FREE implementation-defined
#define ATOMIC_LONG_LOCK_FREE implementation-defined
#define ATOMIC_LLONG_LOCK_FREE implementation-defined
#define ATOMIC_ADDRESS_LOCK_FREE ~~unspecified~~ implementation-defined
}

Edit paragraph 1 as follows.

The ATOMIC_...._LOCK_FREE macros ~~ATOMIC_INTEGRAL_LOCK_FREE and ATOMIC_ADDRESS_LOCK_FREE~~ indicate the ~~general~~ lock-free property of ~~integral and address~~ the corresponding atomic types, with the signed and unsigned variants grouped together. The properties also apply to the corresponding specializations of the atomic template. A value of 0 indicates that the types are never lock-free. A value of 1 indicates that the types are sometimes lock-free. A value of 2 indicates that the types are always lock-free.

Edit paragraph 2 as follows.

The function atomic_is_lock_free (29.6) indicates whether the ~~object~~ type is lock-free. ~~The result of a lock-free query on one object cannot be inferred from the result of a lock-free query on another object.~~ In any given program execution, the result of the lock-free query shall be consistent for all pointers of the same type.
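The relationship between the per-type macros and the run-time query can be sketched as below. The helper names are hypothetical; the example uses ATOMIC_INT_LOCK_FREE and the member function is_lock_free, and assumes an implementation that provides both.

```cpp
#include <atomic>
#include <cassert>

// Checks that a per-type macro value and a run-time query agree:
// 2 means always lock-free, 0 means never, 1 leaves it to the query.
inline bool macro_agrees_with_query(int macro_value, bool query_result) {
    assert(macro_value >= 0 && macro_value <= 2);
    if (macro_value == 2) return query_result;
    if (macro_value == 0) return !query_result;
    return true;  // "sometimes": either answer is consistent
}

inline bool int_lock_free_is_consistent() {
    std::atomic<int> a(0), b(0);
    // Paragraph 2: the answer should not vary between objects of one type.
    bool same_for_all_objects = a.is_lock_free() == b.is_lock_free();
    return same_for_all_objects &&
           macro_agrees_with_query(ATOMIC_INT_LOCK_FREE, a.is_lock_free());
}
```

With the per-type guarantee, an algorithm can query one object and rely on the answer for objects of the same type that it allocates later.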

Typedefs

Issues

LWG 937, US 89

Comment: The types in the table "Atomics for standard typedef types" should be typedefs, not classes. These semantics are necessary for compatibility with C.

Suggestion: Change the classes to typedefs.

Committee: Direct the editor to turn the types into typedefs as proposed in the comment. Paper approved by committee used typedefs, this appears to have been introduced as an editorial change. Rationale: for compatibility with C.

LWG 943: `ssize_t` undefined

Status: Tentatively Ready

Submitter: Holger Grund

Opened: 2008-12-19

Last modified: 2009-05-23

Discussion:

There is a row in "Table 122 - Atomics for standard typedef types" in 29.5.1 [atomics.types.integral] with atomic_ssize_t and ssize_t. Unless I'm missing something, ssize_t is not defined by the standard.

[ Summit: ]

Move to review. Proposed resolution: Remove the typedef. Note: ssize_t is a POSIX type.

[ Batavia (2009-05): ]

We agree with the proposed resolution. Move to Tentatively Ready.

Proposed resolution:

Remove the row containing ssize_t from Table 119 "Atomics for standard typedef types" in 29.5.2 [atomics.types.address].

Wording

29.5.1 Integral Types [atomics.types.integral]

Edit paragraph 1 as follows.

The name atomic_itype and the functions operating on it in the preceding synopsis are placeholders for a set of classes and functions. Throughout the preceding synopsis, atomic_itype should be replaced by each of the class names in table 119 ~~and table 120,~~ and integral should be replaced by the integral type corresponding to the class name. Table 120 shows typedefs to atomic integral classes and the corresponding <stdint.h> typedefs.

In table 119, remove the row containing ssize_t.

Edit the heading in table 120 as follows.

~~Class name~~ atomic typedef name ~~Integral type~~ stdint typedef name

Volatile

Issues

LWG 1147, US 90: non-volatile atomic functions

Status: New

Submitter: Jeffrey Yasskin

Discussion: The C++0x draft declares all of the functions dealing with atomics (section 29.6 [atomics.types.operations]) to take volatile arguments. Yet it also says (29.4-3),

[Note: Many operations are volatile-qualified. The "volatile as device register" semantics have not changed in the standard. This qualification means that volatility is preserved when applying these operations to volatile objects. It does not mean that operations on non-volatile objects become volatile. Thus, volatile qualified operations on non-volatile objects may be merged under some conditions. —end note ]

I was thinking about how to implement this in gcc, and I believe that we'll want to overload most of the functions on volatile and non-volatile. Here's why:

To let the compiler take advantage of the permission to merge non-volatile atomic operations and reorder atomics in certain cases, we'll need to tell the compiler backend about exactly which atomic operation was used. So I expect most of the functions of the form atomic_<op>_explicit() (e.g. atomic_load_explicit, atomic_exchange_explicit, atomic_fetch_add_explicit, etc.) to become compiler builtins. A builtin can tell whether its argument was volatile or not, so those functions don't really need extra explicit overloads. However, I don't expect that we'll want to add builtins for every function in chapter 29, since most can be implemented in terms of the _explicit free functions:


class atomic_int {
  __atomic_int_storage value;
 public:
  int fetch_add(int increment, memory_order order = memory_order_seq_cst) volatile {
    // &value has type "volatile __atomic_int_storage*".
    return atomic_fetch_add_explicit(&value, increment, order);
  }
  ...
};

But now this always calls the volatile builtin version of atomic_fetch_add_explicit(), even if the atomic_int wasn't declared volatile. To preserve volatility and the compiler's permission to optimize, I'd need to write:


class atomic_int {
  __atomic_int_storage value;
 public:
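  int fetch_add(int increment, memory_order order = memory_order_seq_cst) volatile {
    // &value has type "volatile __atomic_int_storage*".
    return atomic_fetch_add_explicit(&value, increment, order);
  }
  int fetch_add(int increment, memory_order order = memory_order_seq_cst) {
    // &value has type "__atomic_int_storage*", so the non-volatile builtin is used.
    return atomic_fetch_add_explicit(&value, increment, order);
  }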
  ...
};

But this is visibly different from the declarations in the standard because it's now overloaded. (Consider passing &atomic_int::fetch_add as a template parameter.)

The implementation may already have permission to add overloads to the member functions:

17.6.4.5 [member.functions] An implementation may declare additional non-virtual member function signatures within a class:
...

by adding a member function signature for a member function name.

but I don't see an equivalent permission to add overloads to the free functions.
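The overloading described above can be sketched with a wrapper that records which overload was selected. The class counter and the helper uses_volatile_overload are hypothetical; the wrapper is built on std::atomic<int> purely for illustration and is not the implementation strategy under discussion.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper, not the standard class: it records which overload ran.
class counter {
    std::atomic<int> value_;
public:
    counter() : value_(0) {}
    // Selected for volatile objects; volatility is preserved.
    int fetch_add(int n, bool* used_volatile) volatile {
        assert(used_volatile != nullptr);
        *used_volatile = true;
        return value_.fetch_add(n);
    }
    // Selected for non-volatile objects.
    int fetch_add(int n, bool* used_volatile) {
        assert(used_volatile != nullptr);
        *used_volatile = false;
        return value_.fetch_add(n);
    }
};

// Calls fetch_add on c and reports whether the volatile overload was chosen.
template <class Counter>
bool uses_volatile_overload(Counter& c) {
    bool used_volatile = false;
    c.fetch_add(1, &used_volatile);
    return used_volatile;
}
```

Overload resolution picks the volatile member only when the object itself is volatile, which is what lets an implementation keep stronger semantics for volatile objects.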

[ 2009-06-16 Lawrence adds: ]

I recommend allowing non-volatile overloads.

Committee: Should explicitly consider the process shared issue. 908: Move to open. Assign to Lawrence. Related to US 90 comment.

C committee does not want to make promises about optimizations
Current note in paragraph 3 of [atomics.types.operations] is based on a wrong assumption
C++ resolution: add the non-volatile overloads.
Expectation that the C standard will be phrased in such a way that non-volatile atomics are not necessarily forced to be volatile, even if the standard only specifies the volatile forms
Related issue: process-shared atomics. More work needs to be done, at a minimum POSIX will need to specify this.

LWG 908: Deleted assignment operators for atomic types must be volatile

Status: Open

Submitter: Anthony Williams

The deleted copy-assignment operators for the atomic types are not marked as volatile in N2723, whereas the assignment operators from the associated non-atomic types are. e.g.


atomic_bool& operator=(atomic_bool const&) = delete;
atomic_bool& operator=(bool) volatile;

This leads to ambiguity when assigning a non-atomic value to a non-volatile instance of an atomic type:


atomic_bool b;
b=false;

Both assignment operators require a standard conversion: the copy-assignment operator can use the implicit atomic_bool(bool) conversion constructor to convert false to an instance of atomic_bool, or b can undergo a qualification conversion in order to use the assignment from a plain bool.

This is only a problem once issue 845 is applied.

[ Summit: ]

Move to open. Assign to Lawrence. Related to US 90 comment.

Proposed resolution:

Add volatile qualification to the deleted copy-assignment operator of all the atomic types:


atomic_bool& operator=(atomic_bool const&) volatile = delete;
atomic_itype& operator=(atomic_itype const&) volatile = delete;

etc.

This will mean that the deleted copy-assignment operator will require two conversions in the above example, and thus be a worse match than the assignment from plain bool.
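The effect of the proposed resolution on overload resolution can be sketched with a reduced stand-in class. The class flag_like and the function assign_plain_bool are hypothetical and contain only the two operators at issue.

```cpp
#include <cassert>

// Hypothetical stand-in for atomic_bool, reduced to the operators at issue.
class flag_like {
    bool value_;
public:
    flag_like(bool v = false) : value_(v) {}
    // Deleted copy assignment, volatile-qualified as the resolution proposes.
    flag_like& operator=(const flag_like&) volatile = delete;
    // Assignment from the plain type.
    bool operator=(bool v) volatile { value_ = v; return v; }
    bool get() const volatile { return value_; }
};

// Assigns a plain bool to a non-volatile object and returns the stored value.
inline bool assign_plain_bool(bool v) {
    flag_like b;
    b = v;  // selects operator=(bool): no user-defined conversion is needed
    assert(b.get() == v);
    return b.get();
}
```

If the volatile qualifier were removed from the deleted operator, the assignment in assign_plain_bool would be expected to become ambiguous, as the issue describes.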

Wording

29 Atomic operations library [atomics]

For each volatile qualified function or function with volatile qualified parameter, add a non-volatile qualified version. For each version of the assignment operator, add a volatile qualified version.

Memory Order of Compare Exchange

Issues

US 91: Failed Compare Exchange

Comment: Whether or not a failed compare_exchange is a RMW operation (as used in 1.10 [intro.multithread]) is unclear.

Suggestion: Make failing compare_exchange operations not be RMW. See the attached paper under "atomic RMW status of failed compare_exchange".

Disposition: Accepted.

Committee:

LWG 1043: Response to US 91

Status: Review

Submitter: Alisdair Meredith

It is unclear whether or not a failed compare_exchange is a RMW operation (as used in 1.10 [intro.multithread]).

US 92: RMW with Consume

Comment: The effect of memory_order_consume with atomic RMW operations is unclear.

Suggestion: Follow the lead of fences [atomics.fences], and promote memory_order_consume to memory_order_acquire with RMW operations.

Disposition: NAD. We cannot see the issue being suggested by the comment.

Committee:

Wording

29.6 Operations on Atomic Types [atomics.types.operations]

Edit paragraph 18 as follows.

Effects: Atomically, compares the value pointed to by object or by this for equality with that in expected, and if true, replaces the value pointed to by object or by this with desired, and if false, updates the value in expected with the value pointed to by object or by this. Further, if the comparison is true, memory is affected according to the value of success, and if the comparison is false, memory is affected according to the value of failure. When only one memory_order argument is supplied, the value of success is order, and the value of failure is order except that a value of memory_order_acq_rel shall be replaced by the value memory_order_acquire and a value of memory_order_release shall be replaced by the value memory_order_relaxed. If the operation returns true, ~~These~~ these operations are atomic read-modify-write operations (1.10). Otherwise, these operations are atomic load operations.
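The observable difference between a successful and a failed operation can be sketched as follows. The struct cas_result and the function try_cas are hypothetical names; the sketch shows only the values involved, not the memory ordering.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical record of what one compare-and-exchange did.
struct cas_result { bool succeeded; int expected_after; int object_after; };

// Performs one strong compare-and-exchange on a fresh atomic<int>.
inline cas_result try_cas(int initial, int expected, int desired) {
    std::atomic<int> object(initial);
    const bool should_match = (initial == expected);
    bool ok = object.compare_exchange_strong(expected, desired);
    assert(ok == should_match);  // the strong form does not fail spuriously
    return cas_result{ok, expected, object.load()};
}
```

On failure the object is left unmodified and only expected is written, which is consistent with treating the failed operation as a load.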

Atomic Flag

Issues

LWG ?, WG14: Flag Initialization

Comment:

Suggestion:

Disposition:

Committee:

C committee believes atomic_flag's initialization should be indeterminate
C++ will add atomic_flag initialized to be indeterminate
May not be necessary to add any language to clear to specify that calling it will not cause undefined behaviour if the atomic_flag is indeterminate, but if necessary, will be added
Clark notes there is no paper trail (e.g. an NB comment) justifying this change. Requires a liaison statement from the C committee. Expect this will happen.

Wording

29.7 Flag Type and Operations [atomics.flag]

Edit paragraph 4 as follows.

The macro ATOMIC_FLAG_INIT shall be defined in such a way that it can be used to initialize an object of type atomic_flag to the clear state. For a static-duration object, that initialization shall be static. ~~A program that uses an object of type atomic_flag without initializing it with the macro ATOMIC_FLAG_INIT is ill-formed.~~ An uninitialized atomic_flag shall indeterminately have an initial state of either set or clear. [Example:
atomic_flag guard = ATOMIC_FLAG_INIT;
—end example]
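As a minimal sketch of why a definite initial state matters, the following uses a flag initialized with ATOMIC_FLAG_INIT as a try-lock. The helper names try_acquire and release are hypothetical, and the member functions test_and_set and clear are used as published in C++11. A flag left uninitialized under the revised wording could start in either state, so the first try_acquire might fail.

```cpp
#include <atomic>
#include <cassert>

// Statically initialized to the clear state, as paragraph 4 requires of the macro.
std::atomic_flag guard = ATOMIC_FLAG_INIT;

// Hypothetical helper: returns true if the caller took the guard.
inline bool try_acquire() {
    // test_and_set returns the previous state; false means the flag was clear.
    return !guard.test_and_set(std::memory_order_acquire);
}

// Hypothetical helper: returns the guard to the clear state.
inline void release() {
    guard.clear(std::memory_order_release);
}
```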

Derivation

Issues

LWG 944: `atomic<bool>` derive from `atomic_bool`?

Status: Open

Submitter: Holger Grund

Discussion: I think it's fairly obvious that atomic<bool> is supposed to be derived from atomic_bool (and otherwise follow the atomic<integral> interface), though I think the current wording doesn't support this. I raised this point along with atomic<floating-point> privately with Herb and I seem to recall it came up in the resulting discussion on this list. However, I don't see anything on the current libs issue list mentioning this problem.

29.5.3 [atomics.types.generic]/3 reads

There are full specializations over the integral types on the atomic class template. For each integral type integral in the second column of table 121 or table 122, the specialization atomic<integral> shall be publicly derived from the corresponding atomic integral type in the first column of the table. These specializations shall have trivial default constructors and trivial destructors.

Table 121 does not include (atomic_bool, bool), so that this should probably be mentioned explicitly in the quoted paragraph.

[ Summit: ]

Move to open. Lawrence will draft a proposed resolution. Also, ask Howard to fix the title.

[ Post Summit Anthony provided proposed wording. ]

Proposed resolution:

Replace paragraph 3 in 29.5.3 [atomics.types.generic] with

-3- There are full specializations over the integral types on the atomic class template. For each integral type integral in the second column of table 121 or table 122, the specialization atomic<integral> shall be publicly derived from the corresponding atomic integral type in the first column of the table. In addition, the specialization atomic<bool> shall be publicly derived from atomic_bool. These specializations shall have trivial default constructors and trivial destructors.

Committee: Move to open. Lawrence will draft a proposed resolution. Also, ask Howard to fix the title.

Wording

29.5.3 Generic Types [atomics.types.generic]

Edit paragraph 3 as follows.

There are full specializations over the integral types on the atomic class template. For each integral type integral in the second column of table 118 or table 119, the specialization atomic<integral> shall be publicly derived from the corresponding atomic integral type in the first column of the table. In addition, the specialization atomic<bool> shall be publicly derived from atomic_bool. These specializations shall have trivial default constructors and trivial destructors.
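The practical consequence of public derivation can be sketched as follows. The types sketch_atomic_bool and sketch_atomic are hypothetical stand-ins, not the library types, and the function accepts_base only models a C-style interface that takes a pointer to the base type.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the draft's atomic_bool.
struct sketch_atomic_bool {
    sketch_atomic_bool() = default;   // trivial default constructor
    bool value;
};

// Hypothetical stand-in for the atomic class template; only the bool
// specialization is defined in this sketch.
template <class T> struct sketch_atomic;

template <> struct sketch_atomic<bool> : sketch_atomic_bool {
    sketch_atomic() = default;
};

static_assert(std::is_base_of<sketch_atomic_bool, sketch_atomic<bool>>::value,
              "the specialization derives from the base type");
static_assert(std::is_trivially_default_constructible<sketch_atomic<bool>>::value,
              "the default constructor is trivial");
static_assert(std::is_trivially_destructible<sketch_atomic<bool>>::value,
              "the destructor is trivial");

// Models a free function that takes a pointer to the base type.
inline bool accepts_base(sketch_atomic_bool*) { return true; }
```

Because the derivation is public, a pointer to the specialization converts implicitly to a pointer to the base, so functions declared on the base type also accept the specialization.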

Const Load

Issues

LWG 879: Atomic load const qualification

Status: Review

Submitter: Alexander Chemeris

Discussion: The atomic_address type and atomic<T*> specialization provide atomic updates to pointers. However, the current specification requires that the pointers be to non-const objects. This restriction is unnecessary and unintended.

Proposed resolution: Add const qualification to the pointer values of the atomic_address and atomic<T*> specializations. E.g.


typedef struct atomic_address {
   void store(const void*, memory_order = memory_order_seq_cst) volatile;
   void* exchange( const void*, memory_order = memory_order_seq_cst) volatile;
   bool compare_exchange( const void*&, const void*,
                          memory_order, memory_order) volatile;
   bool compare_exchange( const void*&, const void*,
                          memory_order = memory_order_seq_cst ) volatile;
   void* operator=(const void*) volatile;
} atomic_address;

void atomic_store(volatile atomic_address*, const void*);
void atomic_store_explicit(volatile atomic_address*, const void*,
                          memory_order);
void* atomic_exchange(volatile atomic_address*, const void*);
void* atomic_exchange_explicit(volatile atomic_address*, const void*,
                              memory_order);
bool atomic_compare_exchange(volatile atomic_address*,
                            const void**, const void*);
bool atomic_compare_exchange_explicit(volatile atomic_address*,
                                     const void**, const void*,
                                     memory_order, memory_order);

Committee: Move to review. Lawrence will first check with Peter whether the current examples are sufficient, or whether they need to be expanded to include all cases.

Peter: I think that compare_exchange needs both void*& and const void*&, and both void** and const void**.


struct atomic_address
{
  bool compare_exchange( const void*& , const void* );
};

bool atomic_compare_exchange( volatile atomic_address*,
  const void**, const void* );

int main()
{
  atomic_address x;
  void * p;

  x.compare_exchange( p, 0 ); // error
  atomic_compare_exchange( &x, &p, 0 ); // error
}
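The two errors in the example above follow from ordinary conversion rules, which can be checked at compile time. This is a sketch using standard type traits; the constant names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// A non-const lvalue reference to const void* cannot bind to a void* lvalue:
// the conversion would need a temporary.
constexpr bool void_ref_binds_to_const_void_ref =
    std::is_convertible<void*&, const void*&>::value;

// A const void* lvalue binds directly.
constexpr bool const_void_ref_binds_to_const_void_ref =
    std::is_convertible<const void*&, const void*&>::value;

// void** does not convert to const void**, because const is added at an
// inner level without also being added at the outer level.
constexpr bool ptr_ptr_converts =
    std::is_convertible<void**, const void**>::value;

// Adding const at every level is permitted.
constexpr bool ptr_ptr_converts_with_outer_const =
    std::is_convertible<void**, const void* const*>::value;

static_assert(!void_ref_binds_to_const_void_ref, "matches the first error");
static_assert(!ptr_ptr_converts, "matches the second error");
```

Since the expected argument must be writable, the fully const-qualified form is not an option for it, which is why both the void* and const void* signatures are needed.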

Wording

29.5.2 Address Type [atomics.types.address]

Edit signatures as follows.


typedef struct atomic_address {
    void store(const void*, memory_order = memory_order_seq_cst) volatile;
    void* exchange( const void*, memory_order = memory_order_seq_cst) volatile;
    bool compare_exchange_weak( void*&, void*,
                                memory_order, memory_order) volatile;
    bool compare_exchange_strong( void*&, void*,
                                  memory_order, memory_order) volatile;
    bool compare_exchange_weak( void*&, void*,
                                memory_order = memory_order_seq_cst ) volatile;
    bool compare_exchange_strong( void*&, void*,
                                  memory_order = memory_order_seq_cst ) volatile;
    bool compare_exchange_weak( const void*&, const void*,
                                memory_order, memory_order) volatile;
    bool compare_exchange_strong( const void*&, const void*,
                                  memory_order, memory_order) volatile;
    bool compare_exchange_weak( const void*&, const void*,
                                memory_order = memory_order_seq_cst ) volatile;
    bool compare_exchange_strong( const void*&, const void*,
                                  memory_order = memory_order_seq_cst ) volatile;
    void* operator=(const void*) volatile;
} atomic_address;
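How the paired overloads resolve can be sketched with a hypothetical, non-atomic stand-in named sketch_address. It mirrors only the parameter types of the signatures above, omitting the memory_order parameters and the volatile qualification, and makes no attempt at atomicity.

```cpp
#include <cassert>

// Hypothetical stand-in: not atomic, and not the real atomic_address.
struct sketch_address {
    const void* stored = nullptr;

    // Selected when the expected argument is a void* lvalue.
    bool compare_exchange_strong(void*& expected, void* desired) {
        if (stored == expected) { stored = desired; return true; }
        expected = const_cast<void*>(stored);
        return false;
    }

    // Selected when the expected argument is a const void* lvalue.
    bool compare_exchange_strong(const void*& expected, const void* desired) {
        if (stored == expected) { stored = desired; return true; }
        expected = stored;
        return false;
    }
};
```

With both overloads present, each call has exactly one viable candidate, because neither reference parameter can bind to the other pointer type.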

Discussion: The atomic_exchange and atomic_exchange_explicit functions seem to be inconsistently missing parameters.

Proposed resolution:

Add the appropriate parameters. For example,


bool atomic_exchange(volatile atomic_bool*, bool);
bool atomic_exchange_explicit(volatile atomic_bool*, bool, memory_order);

Resolution:

Lawrence: Need to write up a list for Pete with details.

Wording

29.2 Header `<cstdatomic>` synopsis [atomics.syn]

Edit within the synopsis as follows.

....
bool atomic_exchange(volatile atomic_bool*, bool);
bool atomic_exchange_explicit(volatile atomic_bool*, bool, memory_order);
....
integral atomic_exchange(volatile atomic_itype*, integral);
integral atomic_exchange_explicit(volatile atomic_itype*, integral, memory_order);
....
void* atomic_exchange(volatile atomic_address*, void*);
void* atomic_exchange_explicit(volatile atomic_address*, void*, memory_order);
....

29.5.1 Integral Types [atomics.types.integral]

Edit within the synopsis as follows.

....
bool atomic_exchange(volatile atomic_bool*, bool);
bool atomic_exchange_explicit(volatile atomic_bool*, bool, memory_order);
....
integral atomic_exchange(volatile atomic_itype*, integral);
integral atomic_exchange_explicit(volatile atomic_itype*, integral, memory_order);
....

29.5.2 Address Type [atomics.types.address]

Edit within the synopsis as follows.

....
void* atomic_exchange(volatile atomic_address*, void*);
void* atomic_exchange_explicit(volatile atomic_address*, void*, memory_order);
....
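The role of the added parameter can be sketched with the free functions as later published in C++11, which take a pointer to std::atomic<bool> where this draft takes a volatile atomic_bool*. The helper name set_and_fetch_previous is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: stores true and returns the value it replaced.
inline bool set_and_fetch_previous(std::atomic<bool>* flag) {
    // The second argument is the desired value; without it the function
    // would have nothing to store.
    return std::atomic_exchange_explicit(flag, true, std::memory_order_acq_rel);
}
```

The non-explicit form, std::atomic_exchange(flag, false), takes the same desired-value argument and uses sequentially consistent ordering.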
