constexpr
2024-05-05
| document number | date | comment | 
|---|---|---|
| n3253 | 202405 | this paper, original proposal | 
CC BY, see https://creativecommons.org/licenses/by/4.0
C23 has introduced several new features for which the text for inline functions has not yet been properly updated. This concerns:
- typeof or similar expressions, which should be able to use identifiers with static storage duration, even in inline functions.
- constexpr. These refer to objects with internal linkage.
In particular, the latter has already led to diverging practice.
clang accepts it, gcc refuses it. The problem here is that we made unfug have internal linkage, and 6.7.5 p3 states
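The example under discussion did not survive in the text. A minimal sketch, assuming it reads the value of a file-scope constexpr object unfug from an inline function gotit (both names appear in the surrounding discussion; the exact body and the pre-C23 fallback macro are assumptions of this sketch):

```c
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L
#define constexpr static const /* pre-C23 stand-in, only so the sketch compiles */
#endif

/* A constexpr object at file scope has internal linkage. */
constexpr unsigned unfug = 1;

/* Inline definition of a function with external linkage. */
inline unsigned gotit(void) {
    return unfug; /* reads the value, but names an identifier with internal linkage */
}
```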
An inline definition of a function with external linkage shall not contain, … anywhere in the tokens making up the function definition, a reference to an identifier with internal linkage.
Note that this uses the vague term “reference to an identifier” instead of simply “an identifier”, which would be more appropriate when discussing token sequences. This strange terminology could perhaps be interpreted as meaning “taking a reference to an identifier”, in which case the interpretation of clang would be correct, and the diagnostic of gcc would be overprotective.
In that existing text, it is important to note that this talks of usages of the identifier (with internal linkage) and not about the use of the underlying object (with static storage duration) or function. A use as in the following
constexpr unsigned unfug = 1;
extern unsigned const*const my_copy;
inline unsigned getit(void) {
       return *my_copy;
}
is valid (and should remain so) even if the instantiation then has
Here the access of the pointer value goes through a pointer object with external linkage and all copies of the inline function will see the same pointer value and will use the same instance of the constexpr object.
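The instantiation itself is not shown in the text. A single-file sketch, assuming the instantiating translation unit defines my_copy as the address of its own unfug and also emits the external definition of getit; in a real program the first part would sit in a header and the second part in exactly one source file:

```c
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L
#define constexpr static const /* pre-C23 stand-in, only so the sketch compiles */
#endif

/* Part seen by every translation unit, e.g. through a header. */
constexpr unsigned unfug = 1;          /* internal linkage: one instance per TU */
extern unsigned const*const my_copy;   /* external linkage: one object per program */

inline unsigned getit(void) {
    return *my_copy;                   /* names only my_copy, never unfug */
}

/* Part added by the single instantiating translation unit (assumed). */
unsigned const*const my_copy = &unfug; /* binds the pointer to this TU's unfug */
extern inline unsigned getit(void);    /* makes this TU provide the external definition */
```

Whichever copy of getit a call ends up using, it dereferences the same pointer object and therefore reaches the same instance of unfug.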
On the other hand we think that the following use of constexpr objects should be prohibited
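The prohibited use is likewise missing from the text. A sketch, assuming it returns the address of unfug rather than its value (the return type and body are assumptions inferred from the description that follows):

```c
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L
#define constexpr static const /* pre-C23 stand-in, only so the sketch compiles */
#endif

constexpr unsigned unfug = 1; /* internal linkage: every TU gets its own object */

/* Inline definition of a function with external linkage. */
inline unsigned const* gotit(void) {
    return &unfug; /* address of the unfug that belongs to the current TU */
}
```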
Here, each translation unit would have a separate instance of the unfug object, and thus gotit returns a different value for each translation unit and the semantics differ.
The new constexpr feature also exposes another problem that is currently not addressed by the C standard at all:
Inline definitions with the same name in different translation units could have different semantics.
This could for example happen simply because the code plainly uses different program text (e.g. include files) for the different TUs. But even if we imposed that inline definitions with the same name are always composed of the same token sequence, identifiers that are used in such a sequence could refer to different features. Such a different “interpretation” of the inline definition could for example happen because some feature macros or enumeration constants are defined differently for the compilation of two separate TUs (already possible in C17) or because constexpr objects have different values (new in C23).
Still, the C standard in several places talks about the function and not about the functions, so it could be argued that differences in semantics between different inline definitions result in undefined behavior by omission.
In any case, this is an inherently dangerous property, and it might be time to state explicitly that divergence in the semantics of inline definitions in different TUs is not intended. As far as we can see, the only possibility here is to make the behavior of a program that has diverging inline definitions undefined. We propose to do that constructively, by imposing that all inline definitions already agree on a token level.
When it comes to semantic differences between an inline definition and the (unique) external definition, the situation is even less clear. When gcc first introduced its inline feature, the possibility that the inline definition and the external definition could be distinct was even advertised as a feature.
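To illustrate how one token sequence can be interpreted differently, the following sketch stands in for two translation units by using two functions whose return statements consist of the same tokens, while a hypothetical enumeration constant scale has a different value in each. The function names and values are purely illustrative:

```c
/* Stand-in for translation unit A, where scale happens to be 2. */
static unsigned scaled_in_tu_a(unsigned x) {
    enum { scale = 2 };
    return scale * x;
}

/* Stand-in for translation unit B: same tokens in the return
   statement, but scale is 3 here, so the result differs. */
static unsigned scaled_in_tu_b(unsigned x) {
    enum { scale = 3 };
    return scale * x;
}
```

With real inline definitions the two bodies would come from the same header, and the difference would be introduced by whatever each translation unit had declared before including it.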
Currently, the C standard only has (at the end of 6.7.5 p7)
It is unspecified whether a call to the function uses the inline definition or the external definition.
So if the inline definition and the external definition have different semantics, the program has indeterminate behavior.
We don’t think that this is a good choice, because these identifiers nevertheless have external linkage and should be considered to have the same semantics across the whole program.
So we think that we should also mark such divergence for inline functions with external linkage as undesirable. If we made that UB, implementations that want to continue to provide this possibility to their customers could still do so by extension.
Removals are in stroke-out red, additions are in underlined green.
Change 6.7.5 p3 and add two footnotes as follows
An inline definition of a function with external linkage shall not contain, anywhere in the tokens making up the function definition,
- a definition of a modifiable object (including a compound literal) with static or thread storage duration,
and shall not contain, anywhere in the tokens making up the function definition,
- ~~a reference to~~ an evaluation of an identifier with internal linkage~~.~~ that is not declared with constexpr,
- an expression that uses an identifier with internal linkage to evaluate the address1) of the underlying feature.2)
1) Expressions that evaluate the address of the underlying feature include the address, array subscripting and function call operators.
2) These constraints are intended to ensure that inline definitions that appear in different translation units but are made up from the same tokens have the same semantics.
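As we read the proposed wording, an inline definition could still evaluate the value of a constexpr identifier, while the forms listed in footnote 1 would be constraint violations. A sketch under that reading; the names limit, clamp_to_limit, table and helper are hypothetical, and the fallback macro is only there for pre-C23 compilers:

```c
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L
#define constexpr static const /* pre-C23 stand-in, only so the sketch compiles */
#endif

constexpr unsigned limit = 4; /* internal linkage, declared with constexpr */

/* Uses only the value of limit, which has to agree in all translation
   units once the definitions are required to have the same semantics. */
inline unsigned clamp_to_limit(unsigned x) {
    return x < limit ? x : limit;
}

/* Makes this file provide the external definition so the sketch links;
   other translation units would keep only the inline definition. */
extern inline unsigned clamp_to_limit(unsigned x);

/* Forms that the proposed wording would reject inside an inline
   definition with external linkage:
     &limit      address operator
     table[i]    array subscripting on an internal-linkage array table
     helper()    function call to an internal-linkage function helper */
```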
If the answer to question 2. is affirmative, add a new paragraph after 6.7.5 p7
All inline definitions with the same external name in different translation units that constitute the program shall be made up of the same token sequence, only discarding possible changes in white space. Any identifier that is used in that token sequence shall be in the same name space (label, tag, member, attribute or ordinary) and refer to compatible features. In particular, all used named constants that are declared in file-scope with the same name shall have compatible type and shall have the same value in all translation units.
Additionally, if the answer to question 3. is affirmative, add to that new paragraph:
All inline definitions and an external definition with the same external name in different translation units that constitute the program shall be made up of the same token sequence, only discarding possible changes in white space. Any identifier that is used in that token sequence shall be in the same name space (label, tag, member, attribute or ordinary) and refer to compatible features. In particular, all used named constants that are declared in file-scope with the same name shall have compatible type and shall have the same value in all translation units.
or, alternatively if the answer to question 4. is affirmative, add a sentence at the end of that new paragraph
It is implementation-defined if a similar property for inline definitions and the external definition of the same function holds.