Clean up atomics, non-normative changes

Jens Gustedt, INRIA and ICube, France

2025-04-02

target

integration into IS ISO/IEC 9899:202y

document history

document number  date    comment
n2389            201906  non-normative parts of n1955, n2064 and n2329
                         proposal for atomic_fetch_OP_explicit resolution
n3523            202503  this document, rebase for C2y
                         removal of atomic_fetch_OP_explicit resolution
                         improve the integration into C terminology
                         extend to some terminology issues in Annex K
                         remove filler sentences
                         remove spurious double quotes

1 Introduction

The concurrent integration of atomics, threads and Annex K into the C standard ran into many difficulties, such as the use of non-normative terminology (processes, address types, regular type, personification of actions, directories), missing integration of threads and atomic synchronization, misleading introductory references to implicated parts of the standard, missing integration between clauses 6 and 7, or spurious claims such as

This paper builds on the following series of papers: n1955, n2064, n2329, and n2389.

In particular, there were the following votes on n2329 during the 2019 London meeting (see n2376).

In answer to that, n2389 combined these two aspects, but did not find consensus at the 2019 Ithaca meeting because of the atomic_fetch_OP_explicit functions.

This paper therefore addresses the non-normative changes that were agreed upon during the London meeting and leaves the question of the atomic_fetch_OP_explicit functions open. Many of the identified problems have already been resolved through other paths, and C23 now has far fewer defects in this domain. So we can concentrate on the few issues from the above papers that remain.

Additionally, we make a pass over the other material (mostly in non-normative text) that was included for C11 but for which proper integration between the different additions and with the standard terminology was missed.

2 Wording changes

New text is underlined green, removed text is struck out in red. Possible reorganization of the paragraphs is left to the discretion of the editors.

5.2.2.5 Multi-threaded executions and data races

This clause is particularly misleading, since synchronization operations are not limited to the library, and within the library they concern much more than the indicated subclauses.

5 The library defines atomic operations (7.17) and operations on mutexes (7.29.4) that are specially identified as synchronization operations. There are operations that are specially identified as synchronization operations. If the implementation supports the atomics extension these are operators and generic functions that act on atomic objects (6.5 and 7.17). If the implementation supports the thread extension these are calls to initialization functions (7.24.1 and 7.29.2), memory management functions (7.24.4), operations on mutexes (7.29.3.5, 7.29.3.6 and 7.29.4), and calls to some thread functions (7.29.5.1, 7.29.5.5 and 7.29.5.6). These operations play a special role in making assignments side effects in one thread visible to another. A synchronization operation on one or more memory locations is one of an acquire operation, a release operation, both an acquire and release operation, or a consume operation. A synchronization operation without an associated memory location is a fence and can be either an acquire fence, a release fence, or both an acquire and release fence. In addition, there are relaxed atomic operations, which are not synchronization operations but still are indivisible and strictly ordered, and atomic read-modify-write operations, which have special characteristics. are those operations defined in 6.5 and 7.17 that act on an atomic object by reading its value, by performing an optional operation with that value and by storing back a value into that object.

11 Certain library calls operations synchronize with other library calls operations performed by another thread. In particular, an atomic operation A that performs a release operation on an object M synchronizes with an atomic operation B that performs an acquire operation on M and reads a value written by any side effect in the release sequence headed by A.
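
As an illustration only (not part of the proposed wording), the following minimal sketch shows a release operation A that synchronizes with an acquire operation B. The names payload, ready, producer and message_passing_demo are hypothetical, and the sketch assumes an implementation that provides <threads.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <threads.h>

static int payload;        /* ordinary, non-atomic data */
static atomic_bool ready;  /* object M in the wording above */

/* Producer: write the payload, then publish it with a release store (A). */
static int producer(void *arg) {
    (void)arg;
    payload = 42;
    atomic_store_explicit(&ready, true, memory_order_release);
    return 0;
}

/* Hypothetical driver: returns the payload as seen by the consumer,
   or -1 if the thread could not be created. */
int message_passing_demo(void) {
    thrd_t t;
    payload = 0;
    atomic_store_explicit(&ready, false, memory_order_relaxed);
    if (thrd_create(&t, producer, NULL) != thrd_success) return -1;
    /* Consumer: the acquire load (B) that reads true synchronizes with A. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        thrd_yield();
    int seen = payload;    /* no data race: the write happens before this read */
    thrd_join(t, NULL);
    assert(seen == 42);
    return seen;
}
```

Without the release and acquire pair (for example with two relaxed operations), the read of payload would conflict with the write and neither would happen before the other.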

Remove the use of non-standard terminology and spurious quotes, and use direct language instead of double negation.

33 NOTE 16 This effectively disallows compiler reordering enforces the ordering of atomic operations to a single object, even if both operations are relaxed loads. By doing so, it effectively makes the “cache coherence” guarantee provided by most hardware available to C atomic operations.

34 NOTE 17 The value observed by a load of an atomic object depends on the happens before relation, which in turn depends on the values observed by loads of atomic objects. The intended reading is that there exists an association of atomic loads with modifications they observe that, together with suitably chosen modification orders and the happens before relation derived as described previously, satisfy the resulting constraints as imposed here.

As defined here, a data race is not an event that fits into the happens before relation. So we can’t speak of a result of it.

35 The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. If a program execution contains a data race, the behavior is undefined.
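
As an illustration only, the sketch below avoids the data race that two threads incrementing a plain long would cause: all conflicting actions are atomic, so the definition above does not apply. The names hits, worker and count_with_two_threads are hypothetical, and <threads.h> is assumed to be available.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

enum { INCREMENTS = 100000 };

static atomic_long hits;   /* atomic: concurrent updates are not a data race */

static int worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; ++i)
        /* relaxed read-modify-write: indivisible, but no synchronization */
        atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
    return 0;
}

/* Hypothetical driver: returns the final count, or -1 on thread failure. */
long count_with_two_threads(void) {
    thrd_t a, b;
    atomic_store(&hits, 0);
    if (thrd_create(&a, worker, NULL) != thrd_success) return -1;
    if (thrd_create(&b, worker, NULL) != thrd_success) {
        thrd_join(a, NULL);
        return -1;
    }
    thrd_join(a, NULL);    /* joining synchronizes with the end of each thread */
    thrd_join(b, NULL);
    long total = atomic_load(&hits);
    assert(total == 2L * INCREMENTS);
    return total;
}
```

If hits were declared as a plain long, the two increments would be conflicting non-atomic actions in different threads with no happens before relation, and the behavior would be undefined.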

Move to standard terminology and remove claims about “data-race-free” programs, a term that is not introduced (only data-race-free program executions are).

36 NOTE 18 It can be shown that programs that correctly use simple mutexes operations on mtx_t and memory_order_seq_cst atomic operations to prevent all data races, and use no other synchronization operations, behave as though the operations executed by their constituent threads were simply interleaved, with each value computation of an object being the last value stored in that interleaving. This is normally referred to as sequential consistency. However, this applies only to data-race-free programs, and data-race-free programs cannot observe most program transformations that do not change single-threaded program semantics. In fact, most single-threaded Many program transformations that are valid in the absence of multiple threads continue to be allowed for sequentially consistent programs, since any execution of such a program that behaves differently as a result of such transformations necessarily has undefined behavior even before such a transformation is applied if the transformation were not applied.
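
As an illustration only, the classic store-buffering test below uses nothing but memory_order_seq_cst operations. In any interleaving of the four operations, at least one load observes the value 1, so the outcome r1 == 0 && r2 == 0 is not expected. All names are hypothetical, and <threads.h> is assumed to be available.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

static atomic_int x, y;   /* reset to 0 at the start of each round */
static int r1, r2;        /* each written by one thread, read after thrd_join */

static int left(void *arg) {
    (void)arg;
    atomic_store(&x, 1);          /* seq_cst store */
    r1 = atomic_load(&y);         /* seq_cst load */
    return 0;
}

static int right(void *arg) {
    (void)arg;
    atomic_store(&y, 1);
    r2 = atomic_load(&x);
    return 0;
}

/* Hypothetical driver: counts rounds in which both loads returned 0. */
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        thrd_t a, b;
        atomic_store(&x, 0);
        atomic_store(&y, 0);
        if (thrd_create(&a, left, NULL) != thrd_success) return -1;
        if (thrd_create(&b, right, NULL) != thrd_success) {
            thrd_join(a, NULL);
            return -1;
        }
        thrd_join(a, NULL);
        thrd_join(b, NULL);
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    assert(both_zero == 0);       /* excluded by sequential consistency */
    return both_zero;
}
```

With weaker orders such as memory_order_relaxed, the outcome where both loads return 0 becomes possible.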

What is a “compiler transformation”? What are “atomics in question”?

37 NOTE 19 Compiler Program transformations that introduce assignments to a potentially shared memory location that would not be modified by the abstract machine are generally precluded by this document, since such an assignment can overwrite another assignment by a different thread in cases in which an abstract machine execution would not have encountered a data race. This includes implementations of data member assignment that overwrite adjacent members in separate memory locations. Reordering of atomic loads in cases in which the atomics in question can atomic operands potentially alias is also generally precluded, since this can may violate the coherence requirements.

Remove some filler text and move to standard terminology.

38 NOTE 20 Transformations that introduce a speculative read of a potentially shared memory location possibly will not preserve the semantics of the program as defined in this document, since they potentially introduce a data race. However, they are typically may be valid in the context of an optimizing compiler that targets a specific machine a specific implementation with well-defined semantics for data races. They would be invalid for a hypothetical machine are invalid for an implementation that is not tolerant of data races or provides hardware data race detection.

6.2.6 Representations of types

We collect all information about atomic types and their operations here to have a unified text. Some phrases in individual clauses on operators, for example, may then be removed.

6.2.6.1 General

9 Loads and stores of objects with atomic types are done with memory_order_seq_cst semantics. If not specified otherwise, synchronizing operations on atomic objects have memory_order_seq_cst memory consistency.

9′ A synchronizing operation on an atomic object by itself never raises a signal, performs a trap, or results in any interruption of the control flow of the current thread.FNTa) For a synchronizing read-modify-write operation on an atomic object where the operation with identical values on the non-atomic type is erroneous,FNTb)
FNTa) Whether or not an atomic operation may be interrupted by a signal depends on the lock-free property of the underlying type.
FNTb) Such erroneous operations may for example incur arithmetic overflow, division by zero or negative shifts.
FNTc) Thus that object representation can be an invalid value for the type such as an invalid address (for pointer types) or can be a floating point NaN (for floating types).

6.5.3.5 Postfix increment and decrement operators

2 … Postfix ++ on an object with atomic type is a read-modify-write operation with memory_order_seq_cst memory order semantics.

6.5.17.3 Compound assignment

4 … If E1 has an atomic type, compound assignment is a read-modify-write operation with memory_order_seq_cst memory order semantics.
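
As an illustration only, the sketch below shows that the postfix and compound assignment operators on an atomic object are read-modify-write operations that behave like the corresponding generic functions with memory_order_seq_cst. The function name operators_demo is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical demo: returns the final value of the atomic object. */
int operators_demo(void) {
    _Atomic int n = 5;                      /* initialization is not an atomic operation */
    int old = n++;                          /* like atomic_fetch_add(&n, 1): yields 5, n is 6 */
    int now = (n += 4);                     /* read-modify-write, yields the new value 10 */
    int via_fn = atomic_fetch_add(&n, 1);   /* yields 10, n becomes 11 */
    assert(old == 5 && now == 10 && via_fn == 10);
    return n;                               /* seq_cst load */
}
```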

6.7.2.4 Atomic type specifiers

Add an example at the end of the clause.

5 EXAMPLE This disambiguation of the grammar is necessary in case a qualifier or specifier is followed by an opening parenthesis.
typedef double toto;

void ic(int const tutu);  // valid prototype, void g(int tutu)
void hc(int const(tutu)); // valid prototype, void g(int tutu)
void gc(int const(toto)); // valid prototype, void g(int(*)(double))

void ia(int _Atomic tutu);  // valid prototype, void g(int tutu)
void ha(int _Atomic(tutu)); // invalid prototype, tutu not a type for _Atomic()
void ga(int _Atomic(toto)); // invalid prototype, two type names in parameter declaration

7.17 Atomics <stdatomic.h>

7.17.1 Introduction

Atomics are not only relevant for threads but also for communication with signal handlers.

1 The header <stdatomic.h> defines several macros and declares several types and functions for performing atomic operations on data shared with signal handlers and between threads.302)
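
As an illustration only, the sketch below shares an atomic object between a signal handler and the normal flow of execution within a single thread. The names interrupted, on_sigint and signal_flag_demo are hypothetical, and the sketch assumes that atomic_int is lock-free on the implementation.

```c
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>

static atomic_int interrupted;   /* shared between the handler and normal flow */

static void on_sigint(int sig) {
    (void)sig;
    atomic_store(&interrupted, 1);   /* assumes atomic_int is lock-free */
}

/* Hypothetical demo: returns 1 if the handler's store was observed,
   or -1 if the handler could not be installed. */
int signal_flag_demo(void) {
    atomic_store(&interrupted, 0);
    if (signal(SIGINT, on_sigint) == SIG_ERR) return -1;
    raise(SIGINT);                   /* the handler runs before raise returns */
    signal(SIGINT, SIG_DFL);
    int seen = atomic_load(&interrupted);
    assert(seen == 1);
    return seen;
}
```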

Replace some unedited text from the original proposal by standardese.

8 NOTE Many operations are volatile-qualified. The “volatile as device register” semantics have not changed in the standard. This qualification means that volatility is preserved when applying these operations to volatile objects. Many of these type generic functions have volatile-qualified parameters to allow their application to volatile-qualified objects.

7.17.2.2 The atomic_init generic function

Be clear that atomic_init does not synchronize and avoid repetition.

3 Although this function initializes an atomic object, it does not avoid data races it is not a synchronizing operation; concurrent access to the object being initialized, even via an atomic operation, constitutes a data race.
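
As an illustration only, the sketch below completes atomic_init before the object can be reached by any other thread or signal handler. The type struct counter and the functions counter_setup and counter_demo are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct counter { atomic_int value; };

/* Must complete before any other thread or signal handler can reach *c. */
void counter_setup(struct counter *c, int start) {
    atomic_init(&c->value, start);   /* initializes, but does not synchronize */
}

/* Hypothetical demo: returns the value after one increment. */
int counter_demo(void) {
    struct counter c;
    counter_setup(&c, 7);
    /* Only from here on may c be published, for example by passing
       its address to thrd_create, which does synchronize. */
    atomic_fetch_add(&c.value, 1);
    int now = atomic_load(&c.value);
    assert(now == 8);
    return now;
}
```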

7.17.3 Order and consistency

7.17.3.1 General

It is not necessarily clear to the occasional reader what a “stronger” memory_order specification (see 7.17.7.5) would be. Therefore, add a new note after p12 that provides wording for a relation between the different memory consistency models.

p12′ NOTE 2′ The memory orderings of memory_order impose different ordering constraints on certain operations. memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_acq_rel and memory_order_seq_cst form an inclusive chain of such constraints, from weakest to strongest. memory_order_release imposes constraints that are incompatible with memory_order_consume and memory_order_acquire, and that are stronger than memory_order_relaxed and weaker than memory_order_acq_rel.
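
As an illustration only, the relation described in this note can be written down as a small predicate. The helpers chain_rank, order_at_least and order_examples are hypothetical and merely encode one reading of the note; they are not part of <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Position in the chain relaxed < consume < acquire < acq_rel < seq_cst;
   -1 for memory_order_release, which is not part of that chain. */
static int chain_rank(memory_order mo) {
    switch (mo) {
    case memory_order_relaxed: return 0;
    case memory_order_consume: return 1;
    case memory_order_acquire: return 2;
    case memory_order_acq_rel: return 3;
    case memory_order_seq_cst: return 4;
    default:                   return -1;
    }
}

/* True if a imposes at least the constraints that b imposes. */
bool order_at_least(memory_order a, memory_order b) {
    if (a == b) return true;
    if (a == memory_order_release) return b == memory_order_relaxed;
    if (b == memory_order_release) return chain_rank(a) >= 3;  /* acq_rel, seq_cst */
    return chain_rank(a) >= chain_rank(b);
}

/* Usage: release is incomparable with acquire in both directions. */
void order_examples(void) {
    assert(order_at_least(memory_order_seq_cst, memory_order_acquire));
    assert(!order_at_least(memory_order_release, memory_order_acquire));
    assert(!order_at_least(memory_order_acquire, memory_order_release));
}
```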

7.17.5 Lock-free property

7.17.5.1 General

1 The atomic lock-free macros indicate the lock-free property of integer and address atomic pointer types. A value of 0 indicates that the type is never lock-free; a value of 1 indicates that the type is sometimes lock-free; a value of 2 indicates that the type is always lock-free.

2 NOTE In addition to the synchronization properties between threads, the lock-free property of a type warrants that operations are perceived indivisible and strictly ordered in the presence of signals, see 5.2.2.4.
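
As an illustration only, the sketch below interprets the three possible values of the lock-free macros. The helpers lock_free_name and report_lock_free are hypothetical; only ATOMIC_INT_LOCK_FREE and ATOMIC_POINTER_LOCK_FREE come from <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical helper: describe a value of an ATOMIC_..._LOCK_FREE macro. */
const char *lock_free_name(int level) {
    switch (level) {
    case 0:  return "never lock-free";
    case 1:  return "sometimes lock-free";
    case 2:  return "always lock-free";
    default: return "unexpected value";
    }
}

/* Usage: print the property for int and for pointer types. */
void report_lock_free(void) {
    assert(ATOMIC_INT_LOCK_FREE >= 0 && ATOMIC_INT_LOCK_FREE <= 2);
    printf("int:     %s\n", lock_free_name(ATOMIC_INT_LOCK_FREE));
    printf("pointer: %s\n", lock_free_name(ATOMIC_POINTER_LOCK_FREE));
}
```

A program that shares an atomic object with a signal handler would typically require the value 2 for the type in question.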

Recommended practice

Operations that are lock-free should also be address-free. That is, atomic operations on the same memory storage location via two different addresses will communicate atomically synchronize (for a memory order other than relaxed) and be indivisible and strictly ordered. The implementation should not depend on any per-process execution specific state. This restriction enables communication via memory mapped into a process more than once and memory shared between two processes. synchronization via storage that is mapped into an execution more than once and storage that is shared between concurrent program executions

7.17.6 Atomic integer types

Recommended practice

3 The representation of an atomic integer type is not required to have the same size as the corresponding regular type non-atomic version of the direct type but it should have the same size whenever possible, as it eases effort required to port existing code.

7.17.7.5 The atomic_compare_exchange generic functions

Description

The failure argument shall not be memory_order_release nor memory_order_acq_rel. The failure argument shall be no stronger not impose more constraints on the operation than the success argument.
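
As an illustration only, the sketch below uses a compare-exchange loop whose failure order (memory_order_acquire) is neither memory_order_release nor memory_order_acq_rel and imposes no more constraints than the success order (memory_order_acq_rel). The functions atomic_fetch_max_sketch and fetch_max_demo are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically raise *target to at least candidate.
   Returns the value that *target held before the call took effect. */
int atomic_fetch_max_sketch(atomic_int *target, int candidate) {
    int seen = atomic_load_explicit(target, memory_order_relaxed);
    while (seen < candidate &&
           !atomic_compare_exchange_weak_explicit(
               target, &seen, candidate,
               memory_order_acq_rel,      /* success order */
               memory_order_acquire)) {   /* failure order */
        /* On failure, seen was updated to the current value; retry. */
    }
    return seen;
}

/* Hypothetical demo: returns the final value of the object. */
int fetch_max_demo(void) {
    atomic_int v;
    atomic_init(&v, 3);
    int first = atomic_fetch_max_sketch(&v, 7);    /* 3 is replaced by 7 */
    int second = atomic_fetch_max_sketch(&v, 5);   /* 7 stays in place */
    assert(first == 3 && second == 7);
    return atomic_load(&v);
}
```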

7.17.7.6 The atomic_fetch and modify generic functions

Description

Atomically replaces the value pointed to by object with the result of the computation applied to the value pointed to by object and the given operand. Memory is affected according to the value of order. These operations are atomic read-modify-write operations (5.2.2.5). For signed integer types, arithmetic performs silent wraparound on integer overflow; there are no undefined results. For address types, the result may be an undefined address, but the operations otherwise have no undefined behavior.
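
As an illustration only, the sketch below applies atomic_fetch_add to an atomic_int that holds INT_MAX. Unlike the expression INT_MAX + 1 on a non-atomic int, the operation wraps around silently. The function name wraparound_demo is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical demo: returns the value stored after the overflowing add. */
int wraparound_demo(void) {
    atomic_int n;
    atomic_init(&n, INT_MAX);
    int before = atomic_fetch_add(&n, 1);   /* read-modify-write, no undefined behavior */
    assert(before == INT_MAX);
    return atomic_load(&n);                 /* wrapped around to INT_MIN */
}
```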

7.17.8 Atomic flag type and operations

7.17.8.1 General

1 The atomic_flag type provides the classic test-and-set functionality. It has an atomic data primitive that has exactly two states, set and clear.

2 …

3 NOTE Hence, as per 7.17.5, the operations should also be address-free. No other type requires lock-free operations, so the atomic_flag type is the minimum hardware-implemented type needed to conform to this document that is asynchronous signal safe and that is expected to be compatible with implementation-specific extensions for shared objects between different program executions. The remaining types can be emulated with atomic_flag, though with less than ideal properties.
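
As an illustration only, the sketch below builds a try-lock on top of the test-and-set functionality of atomic_flag. The names guard, spin_try_lock, spin_unlock and flag_demo are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag guard = ATOMIC_FLAG_INIT;   /* starts in the clear state */

/* Returns true if the caller moved the flag from clear to set. */
bool spin_try_lock(void) {
    return !atomic_flag_test_and_set_explicit(&guard, memory_order_acquire);
}

void spin_unlock(void) {
    atomic_flag_clear_explicit(&guard, memory_order_release);
}

/* Hypothetical demo: encodes the three attempts as bits 0, 1 and 2. */
int flag_demo(void) {
    bool first = spin_try_lock();    /* true: clear -> set */
    bool second = spin_try_lock();   /* false: already set */
    spin_unlock();
    bool third = spin_try_lock();    /* true again after the clear */
    spin_unlock();
    assert(first && !second && third);
    return first + 2 * second + 4 * third;
}
```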

7.30.3.1 General

386) This does not mean that these functions are forbidden to read global state that describes the time and calendar settings of the execution, such as the LC_TIME locale or the implementation-defined specification of the local time zone. Only the setting of that state by setlocale or by means of implementation-defined functions can constitute data races.

K.3.5.2 Operations on files

K.3.5.2.1 The tmpfile_s function

Recommended practice

(note to the editors: the paragraph number is missing, similar changes should also be applied to the clause of tmpfile)

5′

It should be is possible to open at least TMP_MAX_S temporary files during the lifetime of the program program execution (this limit can be shared with tmpnam_s) and there should be no limit on the number simultaneously open other than this limit and any limit on the number of open files (FOPEN_MAX).

K.3.5.2.2 The tmpnam_s function

Recommended practice

People don’t create files, concurrent program executions do. Race condition is not an introduced term. Directories are not a concept that we can refer to in this standard.

7 After a program execution obtains a file name using the tmpnam_s function and before the program execution creates a file with that name, the possibility exists that someone else can create a concurrent program execution creates a file with that same name. To avoid this race condition, the tmpfile_s function should be used instead of tmpnam_s when possible. One situation that requires the use of the tmpnam_s function is when the program needs to create a temporary directory rather than a temporary file.
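
As an illustration only, the sketch below obtains a temporary file without ever generating a name, so no concurrent program execution can create a file with that name first. It uses tmpfile rather than tmpfile_s, on the assumption that many implementations do not provide Annex K. The function tmpfile_roundtrip is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if text survived a round trip through a temporary file,
   0 if it did not, and -1 if no temporary file could be opened. */
int tmpfile_roundtrip(const char *text) {
    char buf[64] = {0};
    size_t len = strlen(text);
    assert(len < sizeof buf);
    FILE *f = tmpfile();             /* opened for update, removed on close */
    if (!f) return -1;
    int ok = fputs(text, f) != EOF;
    rewind(f);                       /* switch from writing to reading */
    ok = ok && fread(buf, 1, len, f) == len && memcmp(buf, text, len) == 0;
    fclose(f);
    return ok ? 1 : 0;
}
```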

8 Implementations should take care in choosing the patterns used for names returned by tmpnam_s. For example, making a thread ID part of the names avoids the race condition and possible conflict when that multiple programs and threads run simultaneously by the same user concurrently and generate the same temporary file names.