ISO/IEC JTC1 SC22 WG21 N3045 = 10-0035 - 2010-02-15
Paul E. McKenney, paulmck@linux.vnet.ibm.com
Mark Batty, mjb220@cl.cam.ac.uk
Clark Nelson, clark.nelson@intel.com
N.M. Maclaren, nmm1@cam.ac.uk
Hans Boehm, hans.boehm@hp.com
Anthony Williams, anthony@justsoftwaresolutions.co.uk
Peter Dimov, pdimov@mmltd.net
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
Mark Batty recently undertook a partial formalization of the C++ memory model, which Mark summarized in N2955. This paper summarizes the discussions on Mark's paper, both verbal and email, recommending appropriate actions. We expect that this working paper will be divided into a group of issues to be applied to the working draft.
The phrase “might be” is indefinite and should be reworded.
Replace “might be” with “is”:
Priority: Low.
N2955 suggests two changes to 1.10p14:
Recommendation: no change.
memory_order_seq_cst operations are also used outside of lock-based critical sections, the result is still simple interleaving.
If atomic memory_order_seq_cst operations are also used both inside and outside of lock-based critical sections, the result is still sequentially consistent, but the individual lock-based critical sections are no longer simply interleaved. However, the result will be consistent with at least one simple interleaving of the individual operations making up each critical section.
Recommendation: update the note to include atomic memory_order_seq_cst operations, reworking the wording appropriately.
Reword the non-normative note in 1.10p14 to include sequentially consistent atomic operations as well as lock-based critical sections, as follows:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. [ Note: It can be shown that programs that correctly use simple locks and memory_order_seq_cst operations to prevent all data races and that use no other synchronization operations behave as [deleted: the executions of] [inserted: if the operations executed by] their constituent threads [deleted: were] [inserted: are] simply interleaved, with each [deleted: observed value] [inserted: value computation] of an object being the [deleted: last value assigned] [inserted: last side effect on that object] in that interleaving. This is normally referred to as “sequential consistency”. However, this applies only to race-free programs, and race-free programs cannot observe most program transformations that do not change single-threaded program semantics. In fact, most single-threaded program transformations continue to be allowed, since any program that behaves differently as a result must perform an undefined operation. — end note ]
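As an illustration of the interleaving guarantee described in the note, the following sketch runs the classic store-buffering test using only memory_order_seq_cst operations. The names Outcome, run_store_buffering_once, and forbidden_outcome_seen are hypothetical and are not part of any proposed wording. No simple interleaving of the four operations lets both loads return zero, so that outcome should never be observed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values read by the two threads in one run of the test.
struct Outcome { int r1; int r2; };

// Each thread stores to one variable and then loads the other.
// All four accesses use memory_order_seq_cst.
Outcome run_store_buffering_once() {
    std::atomic<int> x(0), y(0);
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();

    assert(r1 >= 0 && r2 >= 0);  // both threads ran to completion
    return Outcome{r1, r2};
}

// Hypothetical helper: repeats the test and reports whether the outcome
// r1 == 0 && r2 == 0 (impossible under any simple interleaving) was seen.
bool forbidden_outcome_seen(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        Outcome o = run_store_buffering_once();
        if (o.r1 == 0 && o.r2 == 0) return true;
    }
    return false;
}
```

Weakening any of the four accesses to memory_order_relaxed would take the program outside the scope of the note, and the outcome with both loads returning zero would then be permitted.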
Priority: Medium.
Atomic and locking objects are not trivially copyable [29.5.1p2, 29.5.2p1, 29.5.3p2], so the result of copying them (for example, via std::memcpy) is not specified by the standard [3.9]. Additionally, if the memcpy operation results in a data race, then undefined behavior is explicitly specified by the working draft [1.10p14].
There was some spirited discussion of the non-data-race case on the email reflector, with the following positions outlined:
memory_order_relaxed, and that there are a number of situations (including some implementations of resizeable hash tables) where most accesses to a given object are not subject to data races. In such cases, there is good reason to avoid memory_order_relaxed's overhead for accesses known to be data-race free.
memcpy) was at best unspecified, at worst undefined.
memory_order_relaxed load, so it is not necessary to define the effect of copying the underlying representation. Furthermore, the effect of copying an underlying representation to an atomic object can be both safely and efficiently emulated via a memory_order_relaxed store for machine-word-sized accesses, which are the most common in practice.
memory_order enum member that would permit the implementation to access the atomic object non-atomically (for the purposes of this paper, call it memory_order_nonatomic). This could be thought of as specifying memory ordering that is so relaxed that the implementation need not even guarantee indivisibility of different accesses to the same atomic object. A memory_order_nonatomic operation would therefore be subject to data races.
Therefore, this paper recommends no changes to 1.10p4.
This paper does not recommend adding memory_order_nonatomic to C++0x, but something similar should be considered for a later TR or a later version of the standard.
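The relaxed load and store emulation mentioned in the discussion above might look like the following sketch. The helper copy_atomic_value and the function copy_demo are hypothetical names introduced for illustration only. Each access is indivisible, but the copy as a whole is two separate operations and implies no ordering.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of any proposal): copies the value held by
// one atomic object into another without touching the underlying
// representation, in contrast to std::memcpy.
template <typename T>
void copy_atomic_value(const std::atomic<T>& from, std::atomic<T>& to) {
    // The load and the store are each indivisible, but the pair is not
    // a single atomic operation.
    T value = from.load(std::memory_order_relaxed);
    to.store(value, std::memory_order_relaxed);
}

// Small demonstration: the destination ends up holding the source's value.
int copy_demo() {
    std::atomic<int> source(42);
    std::atomic<int> destination(0);
    copy_atomic_value(source, destination);
    int result = destination.load(std::memory_order_relaxed);
    assert(result == source.load(std::memory_order_relaxed));
    return result;
}
```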
Priority: N/A.
The phrase “M is a maximal contiguous” could be interpreted as meaning the sequence having the maximum value or any of a number of alternative interpretations. However, there were other instances of this abbreviation that were not objected to, so this paper recommends no change.
Priority: N/A.
The intent of this paragraph is that initialization be considered a separate access, but this is not explicitly stated. There is some debate as to whether this needs to be explicitly stated. In the absence of consensus, let those who read the words of this paragraph apply appropriate common sense.
Priority: N/A.
As with 1.10p12, the intent of this paragraph is that initialization be considered a separate access, but this is not explicitly stated. There is some debate as to whether this needs to be explicitly stated. In the absence of consensus, let those who read the words of this paragraph apply appropriate common sense.
Priority: N/A.
Add a note stating that memory_order_relaxed operations must maintain indivisibility, as described in the discussion of 1.10p4. This must be considered in conjunction with the resolution to LWG 1151, which is expected to be addressed by Hans Boehm in N3040.
Add a note as follows:
The enumeration memory_order specifies the detailed regular (non-atomic) memory synchronization order as defined in 1.10 and may provide for operation ordering. Its enumerated values and their meanings are as follows:
— memory_order_relaxed: no operation orders memory.
— memory_order_release, memory_order_acq_rel, and memory_order_seq_cst: a store operation performs a release operation on the affected memory location.
— memory_order_consume: a load operation performs a consume operation on the affected memory location.
— memory_order_acquire, memory_order_acq_rel, and memory_order_seq_cst: a load operation performs an acquire operation on the affected memory location.
[ Note: Atomic operations specifying memory_order_relaxed are relaxed only with respect to memory ordering. Implementations must still guarantee that any given atomic access to a particular atomic object be indivisible with respect to all other atomic accesses to that object. — end note. ]
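The indivisibility requirement in the proposed note can be illustrated with a shared counter that is incremented using only memory_order_relaxed operations. This is a minimal sketch; the function name count_with_relaxed_increments and its parameters are hypothetical. Because each read-modify-write remains indivisible, no increment is lost even though no ordering is requested.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical demonstration: several threads increment one atomic counter
// with memory_order_relaxed. Relaxed affects ordering only, so every
// fetch_add is still indivisible and the final total is exact.
long count_with_relaxed_increments(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);
    std::atomic<long> counter(0);
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() orders the final load after every increment.
    for (std::thread& w : workers) {
        w.join();
    }
    return counter.load(std::memory_order_relaxed);
}
```

Replacing the atomic counter with a plain long would introduce a data race, and the total would no longer be guaranteed.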
Priority: Low.
The second sentence of this paragraph, “Implementations shall not move an atomic operation out of an unbounded loop”, does not add anything to the first sentence, and, worse, can be interpreted as restricting the meaning of the first sentence. This sentence should therefore be deleted. The Library Working Group discussed this change during the Santa Cruz meeting in October 2009, and agreed with this deletion.
Therefore, remove the second sentence of 29.3p9 as follows:
Implementations should make atomic stores visible to atomic loads within a reasonable amount of time. [deleted: Implementations shall not move an atomic operation out of an unbounded loop.]
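A minimal sketch of the kind of code that relies on the first sentence is a loop that waits for another thread's atomic store. The function name wait_for_flag is hypothetical. The loop terminates only if the store eventually becomes visible to the repeated loads, which is what the first sentence asks of implementations.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demonstration: the calling thread spins on an atomic load
// until a second thread's store becomes visible.
bool wait_for_flag() {
    std::atomic<bool> ready(false);

    std::thread setter([&ready] {
        ready.store(true, std::memory_order_relaxed);
    });

    // Each iteration performs a fresh atomic load of the flag.
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }

    setter.join();
    assert(ready.load(std::memory_order_relaxed));
    return true;
}
```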
Priority: Medium.
This topic was the subject of a spirited discussion among a subset of the participants in the C/C++-compatibility effort this past October and November.
Unlike C++, C has no mechanism to force a given variable to be initialized. Therefore, if C++ atomics are going to be compatible with those of C, either C++ needs to tolerate uninitialized atomic objects, or C needs to require that all atomic objects be initialized. There are a number of cases to consider:
“={value}” syntax may be used to explicitly initialize these values; however, such initialization may not contain any statements executing at run time.
auto variables. The C standard does not require that these be initialized. On some machines, such variables might be initialized to an error value (for example, not-a-thing (NAT) for variables on Itanium that live only in a machine register). The C “={value}” syntax may be used to explicitly initialize these values, and may include statements executing at run time.
malloc(). The C standard does not require that these be initialized. The C “={value}” syntax may not be used to explicitly initialize these values.
Of course, C on-stack auto variables and dynamically allocated variables are inaccessible to other threads until references to them are published. Such publication must ensure that any initialization happens before any access to the variable from another thread, for example, by use of store release or locking.
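The publication requirement can be sketched as follows, here in C++ and using a store release paired with a load acquire. The type Node and the function publish_and_read are hypothetical names used only for illustration. The initialization of the on-stack object happens before the release store, so a reader that observes the published pointer also observes the initialized value.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Ordinary (non-atomic) data that is private until it is published.
struct Node {
    int payload;
};

// Hypothetical demonstration of publishing an on-stack object to
// another thread by way of a release store.
int publish_and_read() {
    std::atomic<Node*> published(nullptr);
    Node node;
    int observed = -1;

    std::thread reader([&] {
        Node* p;
        // The acquire load pairs with the release store below.
        while ((p = published.load(std::memory_order_acquire)) == nullptr) {
            std::this_thread::yield();
        }
        observed = p->payload;  // initialization happens before this read
    });

    node.payload = 42;                                  // initialization
    published.store(&node, std::memory_order_release);  // publication

    reader.join();
    assert(observed == node.payload);
    return observed;
}
```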
There are also a number of interesting constraints on these types:
These constraints permit but three known ways for C++ to make use of non-generic atomic types defined in C-language translation units:
auto variables.
The wording below permits any of the above implementation alternatives.
Add the following to WG21 29.5.1 (Integral Types) in locations corresponding to the existing atomic_is_lock_free() functions:
atomic_bool ATOMIC_VAR_INIT(bool);
void atomic_init(volatile atomic_bool*, bool);
void atomic_init(atomic_bool*, bool);
atomic_itype ATOMIC_VAR_INIT(itype);
void atomic_init(volatile atomic_itype*, itype);
void atomic_init(atomic_itype*, itype);
Note that ATOMIC_INIT is already in use, for example, in the Linux kernel. Google code search was unable to find ATOMIC_VAR_INIT or atomic_init.
Add the following to WG21 29.5.2 (Address Type) in the location corresponding to the existing atomic_is_lock_free() function:
atomic_address ATOMIC_VAR_INIT(void *);
void atomic_init(volatile atomic_address*, void *);
void atomic_init(atomic_address*, void *);
Add the following after WG21 29.6p4 (Operations on Atomic Types):
ATOMIC_VAR_INIT(x);
A macro expanding to a token sequence suitable for initializing an atomic variable of a type that is the atomic equivalent of the type of x. Concurrent access to the variable being initialized, even via an atomic operation, constitutes a data race.
[ Example:
atomic_int v = ATOMIC_VAR_INIT(5);
— end example ]
Add the following after WG21 29.6p5 (Operations on Atomic Types):
void atomic_init(volatile A *object, C desired);
void atomic_init(A *object, C desired);
Effects: Non-atomically assigns the value desired to object. Concurrent access from another thread, even via an atomic operation, constitutes a data race.
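A minimal sketch of how atomic_init might be used on a dynamically allocated object follows. It assumes a library that provides std::atomic_init for the std::atomic template with the semantics proposed here; the type Widget and the function init_dynamic_atomic are hypothetical. The initialization is performed while the object is still private to the allocating thread, so no concurrent access can occur.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical type containing an atomic member.
struct Widget {
    std::atomic<int> a;
};

// Allocates a Widget, initializes its atomic member non-atomically
// before the object is shared with any other thread, and reads it back.
int init_dynamic_atomic() {
    Widget* p = new Widget;        // member 'a' is default-constructed
    std::atomic_init(&p->a, 42);   // non-atomic initialization, done once

    // From this point the object could be published to other threads.
    int value = p->a.load(std::memory_order_relaxed);
    assert(value == 42);

    delete p;
    return value;
}
```

Some toolchains may emit a deprecation warning for this call when compiling in newer language modes; the sketch is intended only to show the initialize-then-publish pattern.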
In addition, WG14's C-language working draft requires initializers for non-flag atomic types (initialization is already provided in the C++ working draft via constructors). These are listed below for convenience, but will need to be the subject of a later WG14 paper.
Change WG14 7.16.1p1 as follows:
The header <stdatomic.h> defines [deleted: three] [inserted: four] macros and declares several types and functions for performing atomic operations on data shared between threads.
Change WG14 7.16.1p2 as follows:
The macros defined are
ATOMIC_INTEGRAL_LOCK_FREE
ATOMIC_ADDRESS_LOCK_FREE
which indicate the general lock-free property of integer and address atomic types; and
ATOMIC_FLAG_INIT
ATOMIC_VAR_INIT
atomic_init
[deleted: which expands to an initializer for an object of type atomic_flag.] [inserted: which expands to an initializer of an atomic type and to an execution-time initializer for an atomic type, respectively.]
Add a new section to WG14 named “Initialization”:
7.16.N Initialization
The macro ATOMIC_VAR_INIT may be used to initialize an atomic variable declaration; however, the default zero-initialization is guaranteed to produce a valid object where it applies.
EXAMPLE
atomic_int guide = ATOMIC_VAR_INIT(42);
The macro atomic_init may be used to initialize an atomic variable at execution time, for example, for atomic variables that have been dynamically allocated.
EXAMPLE
atomic_init(&p->a, 42);
An atomic variable that is not explicitly initialized with ATOMIC_VAR_INIT is initially in an indeterminate state.
Delete WG14 7.16.7p4:
The macro ATOMIC_FLAG_INIT may be used to initialize an atomic_flag to the clear state. An atomic_flag that is not explicitly initialized with ATOMIC_FLAG_INIT is initially in an indeterminate state.
EXAMPLE
atomic_flag guard = ATOMIC_FLAG_INIT;
Priority: Medium.