2023-01-08
org: | ISO/IEC JTC1/SC22/WG14 | document: | N3079 | |
target: | IS 9899:2023 | version: | 1 | |
date: | 2023-01-08 | license: | CC BY |
We only deal with the second part of that comment, which addresses type inference.
n2953 Type inference for object definitions / Under specified Types.
We believe that the proposal, which allows the use of auto to infer the type from initialization, will have a detrimental effect on safety and reliability if one chooses to use it. In C, different types can have very different behaviors. The most obvious case is the different ways signed and unsigned integers handle overflow. Not explicitly stating the type, but rather relying on an implicit assumption, is in our view a serious risk.
Consider the following:
auto limit = MY_LIMIT;
if(limit + add < limit) /* overflow protection */
return;
The user here assumes that MY_LIMIT is an unsigned int. Since unsigned int has well-defined overflow behavior, the overflow test is entirely valid. However, should the MY_LIMIT define be a signed int, then overflow is undefined, and a compiler may choose to optimize away the overflow protection. The implementation will issue no warning if the assumption is wrong. If instead the user had explicitly declared that they expect the limit to be, say, an unsigned int, then the code would not break, and if MY_LIMIT were out of range of unsigned int, the implementation could issue a clear diagnostic.
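To make the two cases concrete, the following sketch contrasts the unsigned wrap-around test with a signed check that avoids the overflow altogether. The names would_wrap_unsigned, would_overflow_signed and overflow_checks_demo are hypothetical and do not appear in the comment; the signed variant is one common way to write such a check, not the only one.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic wraps modulo UINT_MAX + 1, so comparing the sum
   against one operand is a well-defined way to detect wrap-around. */
static bool would_wrap_unsigned(unsigned int limit, unsigned int add)
{
    return limit + add < limit;
}

/* Signed overflow is undefined, so the test has to be made before the
   addition, using only values that are known to be representable. */
static bool would_overflow_signed(int limit, int add)
{
    if (add > 0)
        return limit > INT_MAX - add;
    if (add < 0)
        return limit < INT_MIN - add;
    return false;
}

/* A few checks at the edges of both ranges. */
static void overflow_checks_demo(void)
{
    assert(would_wrap_unsigned(UINT_MAX, 1u));
    assert(!would_wrap_unsigned(10u, 5u));
    assert(would_overflow_signed(INT_MAX, 1));
    assert(would_overflow_signed(INT_MIN, -1));
    assert(!would_overflow_signed(10, 5));
}
```

With an explicit unsigned int declaration the first form is sufficient; if the type is inferred and turns out to be signed, only the second form remains valid.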
For this reason we believe that this functionality will be discouraged by organizations like MISRA, and by various style guides, since it actively prevents compilers and other diagnostic tools from verifying that the actions of the program match the programmer's intentions. We believe that WG14 should not introduce new functionality whose use is likely to be discouraged from a safety and reliability perspective.
We believe this functionality will be adopted by only a very small minority of C programmers. This, however, leaves the possibility of accidental use, when a user forgets to specify a type properly and the implementation is no longer able to issue a diagnostic, because it is forced to assume that the omission is intentional.
This comment shows quite a narrow perspective on the feature. It is not well aligned with the use that the feature already has in the field, nor does it discuss the primary use case for which the feature was designed and integrated into the current proposal for C23, namely type-generic programming.
Security concerns are of course valid concerns, and they were discussed during the adoption phase for the feature. One of the proponents of the feature, Alex Gilding, is heavily involved in MISRA and will almost certainly write a proposal there that bans the use of auto in the presented form. But it is not appropriate to assume that a feature that is not suited for one part of our community would not be well suited and appropriate for other parts.
Although we suspect that this is not the whole reason for this strong allergic reaction, on the surface the only technical objection that is clearly raised here is that this feature reuses the keyword auto.
We had proposed to use either auto or __auto_type (the current implementation in gcc) for this, and WG14 clearly went for auto. This was mainly to have better cross-language compatibility with C++. We do not think that the comment gives any new argument to question the consensus that was found in WG14.
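As an illustration of the type-generic use case mentioned above, here is a minimal sketch of a swap macro whose temporary takes its type from the initializer. It uses the gcc spelling __auto_type (also accepted by clang) so that it does not depend on C23 support; with a C23 implementation the same line could presumably be written with auto. The names SWAP, tmp_ and swap_demo are made up for this example.

```c
#include <assert.h>

/* Hypothetical type-generic swap. The temporary has whatever type the
   expression (a) has, so one macro serves int, double, pointers, ... */
#define SWAP(a, b)              \
    do {                        \
        __auto_type tmp_ = (a); \
        (a) = (b);              \
        (b) = tmp_;             \
    } while (0)

/* Exercise the same macro with two unrelated types. */
static void swap_demo(void)
{
    int i = 1, j = 2;
    double x = 1.5, y = 2.5;

    SWAP(i, j);
    SWAP(x, y);

    assert(i == 2 && j == 1);
    assert(x == 2.5 && y == 1.5);
}
```

Without type inference, such a macro would need the type as an extra argument or a typeof-style extension.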
The specification of linkage for file-scope objects fails to cover the case of objects declared as thread_local without static or extern.
Change “no storage-class specifier or only the specifier auto” to “does not contain the storage-class specifier static or constexpr”.
6.2.2 Linkages of identifiers
…
5 If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and [-no storage-class specifier or only the specifier auto-] {+does not contain the storage-class specifiers static or constexpr+}, its linkage is external.
(note the editorial change “specifier” → “specifiers”)
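A minimal sketch of the case in question follows. It uses the C11 spelling _Thread_local so that it also compiles without C23 support; the identifiers request_count, private_count and bump_counts are invented for the example.

```c
#include <assert.h>

/* File scope, thread storage duration, neither static nor extern:
   with the proposed wording this identifier has external linkage. */
_Thread_local int request_count;

/* With static the linkage is internal, as before. */
static _Thread_local int private_count;

/* Increment both per-thread counters and return their sum. */
int bump_counts(void)
{
    request_count++;
    private_count++;
    return request_count + private_count;
}
```

Another translation unit could then refer to the first object with extern _Thread_local int request_count; whereas private_count stays local to this file.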
As we’ve been working on implementing this functionality in Clang, we’re finding that the specification differences between C and C++ are a significant source of consternation for us. In C++, auto is a type specifier. In C, auto is not a type, it’s the absence of a type and the use of a storage class specifier.
Please do not resurrect implicit int with different semantics, but define this as a type specifier. Logically, a deduced type is a type and not a class of storage.
The keyword auto already has a meaning in C and that use has never been deprecated. Therefore a removal of that functionality would be a direct violation of WG14’s policy. Seen that way, if we stick to auto as the keyword for this feature, it can always be read as giving new semantics to type omission.
Historically, this feature was implemented by gcc with __auto_type to mark exactly the difference between a storage class and a type specification. It was actually clang’s choice to map this feature to the auto feature from C++, and that thereby eliminated the conceptual difference between the two.
The alternative would be to consistently use __auto_type and to constrain (not only restrict) its use to the grammar as implemented by gcc. WG14 was not in favor of that and we do not see new data here that would warrant reconsidering that decision.
As a consequence, we don’t know what an appropriate action would be to accommodate these concerns.
Introduces incompatible semantics with C++ regarding the following example (undefined behavior in C, accepted in C++):
int i;
auto good = &i;
auto *bad = &i; // Cannot specify the pointer
This style is often a coding standard requirement for code bases in C++ due to the improved code readability: e.g., https://llvm.org/docs/CodingStandards.html#beware-unnecessary-copies-with-auto.
We prefer that it be required to support (optional) pointer and array declarators as part of a deduced type.
First, the introductory phrase of this NB comment is not correct: the semantics are not incompatible here; rather, one standard (C++) defines semantics whereas the other (C) does not.
Nothing prohibits implementations from extending the C semantics towards the C++ semantics. In fact, it was an explicit choice to make this undefined in C (and not a constraint) such that existing implementations (such as clang) would not have to change.
The reason why this semantic restriction exists in C23 is internal. In C++, type inference was well established before the introduction of this feature (namely by overloading and templates), and the definition there actually uses rules that were designed for templates. We were not able to come up with text that would have covered the C++ semantics well enough and decided to first go with the restricted version as implemented in gcc as __auto_type, which was doable in the context of terminology that pre-existed in C.
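The restricted form can be sketched as follows, again with the gcc spelling __auto_type so that no C23 support is assumed. The macro IS_INT_PTR and the function inferred_pointer_demo are hypothetical helpers that only serve to check the inferred types; the form with an explicit pointer declarator is deliberately left out, since its behavior is not defined by the C text discussed here.

```c
#include <assert.h>

/* 1 if the expression has type int *, 0 otherwise (C11 _Generic). */
#define IS_INT_PTR(x) _Generic((x), int *: 1, default: 0)

static void inferred_pointer_demo(void)
{
    int i = 42;
    int arr[3] = { 1, 2, 3 };

    /* Plain identifier as declarator: the type comes entirely from
       the initializer, which here has type int *. */
    __auto_type from_address = &i;

    /* Array to pointer conversion applies to the initializer, so this
       is also inferred as int *, not as an array type. */
    __auto_type from_array = arr;

    assert(IS_INT_PTR(from_address));
    assert(IS_INT_PTR(from_array));
    assert(*from_address == 42 && from_array[2] == 3);
}
```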
N3079 presents an approach that mostly follows C++ and for which we are convinced that it should be integrated into the standard at some point. Below we provide a less complex alternative that keeps the status quo and improves the text where it seemed possible.
Adds a constraint on programs under a “Description” heading which makes it a bit less clear as to how to interpret the “shall” clauses used. For example, is this code UB or is it simply not possible to write:
auto a = { 1, 2 };
If it’s UB, an implementation could elect to deduce a as int[3] or int * and I don’t think we want to allow an extension into that space (FWIW, in C++ that would deduce to a std::initializer_list).
Clarify the intent by moving the specification either to a Constraints or Semantics section, or remove use of the word “shall”.
“Shall” outside of Constraints sections always indicates UB, see Clause 4 p2. But indeed, the use of “Description” as a heading is out of line in the context where this text is located. It would better have been “Semantics”. We apologize for this mistake.
It also seems that the syntax using braced initializers is not compatible with the corresponding construct in C++, since there braced initializers for auto declarations always indicate an inferred array type.
Note that a resolution that mostly implements the C++ approach is presented in N3079.
As an accommodation for these concerns there would be the possibility to move from UB to implementation-defined behavior and to recommend that implementations stick to reasonable semantics as are already provided by the corresponding feature in C++.
6.7.9 Type inference
Constraints
1 A declaration for which the type is inferred shall contain the storage-class specifier auto.
[-Description-] {+Semantics+}
2 For such a declaration that is the definition of an object the init-declarator shall have [-one of the forms-] {+the form+}
direct-declarator = assignment-expression
direct-declarator = { assignment-expression }
direct-declarator = { assignment-expression , }
The [-declared-] {+inferred+} type of the declared object is the type of the assignment expression after lvalue, array to pointer or function to pointer conversion, additionally qualified by qualifiers and amended by attributes as they appear in the declaration specifiers, if any 176). [-If the-] {+Implementations need not accept a+} direct declarator {+that+} is not of the form
identifier attribute-specifier-sequence opt
optionally enclosed in balanced pairs of parentheses[-, the behavior is undefined-]{+; if a direct declarator of a different form is accepted, the behavior is implementation-defined FNT)+}.
FNT) It is recommended that implementations that accept different forms of direct declarators follow the syntax and semantics of the corresponding feature in ISO 14882.
Add to the bibliography (and not to the normative references!)
Programming languages — C++, ISO/IEC IS 14882