==========================================================

Document: N2220

Related: N2223: Clarifying the C Memory Object Model: Introduction to N2219 - N2222, N2091, Section 1 of N2012, Question 2/15 of our survey, Section 3.1 (Q47-48) of our N2013, and DR338.

1 Summary

This document revises N2091, which itself was based on N2012 (Section 1), adding a concrete Technical Corrigendum proposal for discussion and revising the text.

1.1 Problem

In ISO C11 (following C99), trap representations are particular object representations that do not represent values of the object type; merely reading one (except through an lvalue of character type) is undefined behaviour. See 3.19.4, 6.2.6.1p5, 6.2.6.2p2, and DR338. An "indeterminate value" is either a trap representation or an unspecified value.
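
The character-type exemption can be illustrated with a minimal sketch. The function names copy_representation and representation_equals are hypothetical, introduced here only for illustration. The sketch reads the bytes of an object through unsigned char lvalues, which 6.2.6.1p5 permits whatever representation the object holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the object representation of *obj (n bytes) into out[].
   Each byte is read through an lvalue of type unsigned char, which
   6.2.6.1p5 exempts from the trap-representation rule. */
void copy_representation(const void *obj, size_t n, unsigned char *out)
{
    const unsigned char *p = obj;
    assert(obj != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = p[i];
}

/* Compare an object's representation with a byte buffer.
   memcmp interprets both arguments as arrays of unsigned char. */
int representation_equals(const void *obj, const unsigned char *bytes, size_t n)
{
    return memcmp(obj, bytes, n) == 0;
}
```

Reading the same bytes back through an lvalue of the object's own type is where a trap representation, if the type has any, would give undefined behaviour.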

Trap representations complicate the language. Misconceptions about trap representations and their relationship to unspecified values seem common: the ISO notion that trap representations give UB when read is confused with the idea that they give machine traps when read; they are confused with the quite different Itanium NaT concept; and some believe that object types that do not have any unused representations should nonetheless be regarded as potentially having trap representations. They create the possibility of subtle bugs, e.g. if a programmer inadvertently constructs a trap representation and the resulting (unbounded) undefined behaviour is exploited by some unexpected compiler optimisation. It is not clear how significant they are in practice for current C implementations.

1.2 Outline of proposal

We see two options here for the next major version of C.

Ideally we think the concept of trap representation should be removed entirely, following Option (a) below; this requires adapting the treatment of _Bool somewhat, exploiting the unspecified-value semantics. If that is not feasible (e.g. because of unchecked computed branch tables involving _Bool, or NaN issues, or exotic architectures), we propose to keep trap representations but require them to be implementation-defined, Option (b).

2 Proposed Technical Corrigendum

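In either case we suggest removing the final sentence of 6.3.2.1p2 (which makes it undefined behaviour to read an uninitialised automatic object whose address is never taken), so that uninitialised reads (of non-trap-representations, for (b)) are defined behaviour irrespective of whether the address of the read variable is taken.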
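
The distinction drawn by 6.3.2.1p2 can be sketched as follows. The function names are hypothetical, and the comments describe the usual reading of C11 rather than any committee position. Only calls with a non-zero argument are free of the issue; the zero case is the one under discussion. Compilers may warn about the possibly uninitialised reads, which is the point of the example.

```c
#include <assert.h>

/* The address of c is never taken. With init == 0 the read in the
   return statement is undefined behaviour under the final sentence of
   C11 6.3.2.1p2; with that sentence removed it would instead yield an
   unspecified value. */
unsigned char read_no_address(int init)
{
    unsigned char c;
    assert(init == 0 || init == 1);
    if (init)
        c = 7;
    return c;
}

/* The address of c is taken, so the final sentence of 6.3.2.1p2 does
   not apply. With init == 0 the object holds an indeterminate value;
   unsigned char has no trap representations, so on the usual reading
   the read yields an unspecified value rather than UB. */
unsigned char read_address_taken(int init)
{
    unsigned char c;
    unsigned char *p = &c;
    assert(init == 0 || init == 1);
    if (init)
        *p = 7;
    return c;
}
```

Under the suggested change the two functions would have the same status for init == 0: both would return an unspecified value.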

2.1 Option (a): modelling non-{true,false} _Bool representations with unspecified values

To replace the use of trap representations for non-{true,false} _Bool values with unspecified values for the result of operations on such values, one could make the change below. Such values are converted (by the integer promotion rules) to other integer types before they are operated on, so the unspecified value can be introduced just at the conversion point (and then propagated by the operations, as in our N2221 proposal).

Extend 6.3.1.2 Boolean type from

1 When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.59)

to:

1 When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.59) When a value of _Bool type is converted to any other scalar type, if the value is not 0 or 1 the result is an unspecified value.
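
Both directions of conversion can be sketched in C. The first function exercises the existing rule. The second is only a hypothetical model of the proposed sentence: it assumes that a _Bool is represented by a single byte with 0 and 1 encoding false and true, and the names struct converted and model_bool_to_int are ours. It does not read a real _Bool object holding another representation, since that is exactly what C11 leaves undefined.

```c
#include <assert.h>
#include <stdbool.h>

/* Existing 6.3.1.2p1: any scalar converts to 0 or 1. */
int to_bool_as_int(double d)
{
    return (_Bool)d;
}

/* Result of the modelled conversion: either a specified value, or a
   marker that the implementation may choose any value. */
struct converted {
    bool specified;
    int value;
};

/* Hypothetical model of the proposed rule, applied to the assumed
   one-byte representation of a _Bool. */
struct converted model_bool_to_int(unsigned char rep)
{
    struct converted r;
    if (rep == 0 || rep == 1) {
        r.specified = true;
        r.value = rep;
    } else {
        r.specified = false;
        r.value = 0; /* placeholder only: any int would be allowed */
    }
    return r;
}
```

For example, to_bool_as_int(0.5) gives 1 because 0.5 does not compare equal to 0, while model_bool_to_int(2) reports an unspecified result.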

Note that if control-flow choices based on unspecified values were also made undefined behaviour (a separate semantic choice, Q50 of N2221), this would also make unchecked _Bool computed branch tables a sound implementation technique.
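
A computed branch table on _Bool can be sketched at source level as below; a compiler would generate the analogous code internally. All names are hypothetical. The unchecked version indexes the table directly and is sound only if the index is 0 or 1. The checked version takes a raw byte and range-checks it first, which is the cost that the unchecked technique avoids.

```c
#include <assert.h>
#include <stdbool.h>

static int on_false(void) { return 10; }
static int on_true(void)  { return 20; }

typedef int (*handler)(void);

/* Two-entry branch table indexed by a truth value. */
static const handler table[2] = { on_false, on_true };

/* Unchecked dispatch: relies on b converting to exactly 0 or 1. */
int dispatch_unchecked(bool b)
{
    return table[b]();
}

/* Checked dispatch on a raw representation byte: rejects anything
   other than 0 or 1 instead of indexing out of bounds. Returns -1
   for a rejected byte. */
int dispatch_from_byte(unsigned char rep)
{
    if (rep > 1)
        return -1;
    assert(rep == 0 || rep == 1);
    return table[rep]();
}
```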

Removing trap representations entirely would also let one remove the concept of indeterminate value.

2.2 Option (b): making trap representations implementation-defined

Here we suggest making the sets of trap representation values for each type be implementation-defined, thereby requiring implementations to document which representations are trap representations - and hence, in the common case that there are none for non-_Bool integer types or for pointer types, to document that. That will simplify the task of reasoning about C programs in that case, removing the uncertainty about whether an implementation might be treating some representation values as trap representations.

Change 6.2.6.1p5 from

5 Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.50) Such a representation is called a trap representation.

to read

5 Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.50) Such a representation is called a trap representation. The set of trap representations for each object type is an implementation-defined set.
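
For unsigned integer types, part of what such documentation would state can already be checked from <limits.h>. The following is a minimal sketch with hypothetical function names: it counts the value bits of unsigned int and compares them with the size of the object representation. If there are no padding bits, every bit pattern represents a value (6.2.6.2p1), so the type can have no trap representations. The check says nothing about signed, pointer or floating types.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of value bits needed to hold max, assuming max has the
   form 2^N - 1 as the maximum of an unsigned type does. */
unsigned value_bits(unsigned long long max)
{
    unsigned n = 0;
    while (max != 0) {
        n++;
        max >>= 1;
    }
    return n;
}

/* Returns 1 if unsigned int has padding bits (and so might have trap
   representations), 0 if its value bits fill the whole object. */
int unsigned_int_has_padding(void)
{
    size_t object_bits = sizeof(unsigned int) * CHAR_BIT;
    unsigned width = value_bits(UINT_MAX);
    assert(width <= object_bits);
    return width != object_bits;
}
```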