Proposal for C2y
WG14 3339

Title:               auto as a placeholder type specifier, v2
Author, affiliation: Alex Celeste, Perforce
                     Aaron Ballman, Intel
Date:                2024-09-08
Proposal category:   Feature enhancement
Target audience:     Compiler implementers

Abstract

Implementation experience has led maintainers to request two main changes to the way the auto specifier works. First, it should become a type specifier rather than remaining a storage class specifier, because it no longer describes a storage class. Second, it should permit partial deduction of the type of the initializing value, by allowing derived types to be constructed from the placeholder it describes.

Both of these changes should improve compatibility with C++: the first mostly for implementers and readers of the text, the second by better matching common user expectations about how the feature can be used.

This proposal originated as a response to US National Body Comments 121, 122 and 123 against the CD ballot draft of the C23 Standard.


auto as a placeholder type specifier, v2

Reply-to:     Alex Celeste (aceleste@perforce.com)
Document No:  N3339
Revises:      N3076
Date:         2024-09-08

Summary of Changes

N3339

N3076

N3007

Introduction

The adopted specification for auto aimed to meet two goals:

auto can therefore be combined with other type specifiers, but as it is currently provided it is not completely compatible with the C++ feature. This is because the C++ feature specified in C++23/[dcl.type.auto.deduct] relies on wording from the definition of templates (C++23/[temp.deduct.call]), which are completely missing from C; an equivalent feature needs to be reconstructed instead of using a common specification.

However, the missing specifications also leave the C feature far more limited in scope, so reconstructing only the parts relevant to object definitions is not as difficult as initially feared.

Additionally, changing the definition of auto from a storage class to a type specifier is separately possible and comparatively easy. This is a prerequisite task for partial deduction which can be integrated separately.

Proposal

auto as a type specifier

US NB Comment 121 against the C23 CD ballot requested that auto be made into a type specifier rather than being left as a storage class. Treating it as a storage class creates unnecessary complexity for implementers and introduces incompatibility with C++. It retains the ambiguity that a declaration with no type at all still needs one to be added implicitly by the compiler (implicit int becoming implicitly-deduced, but still syntactically implicit), makes the specification more complicated, and serves no purpose because the specifier does not describe a storage class (it can be used with static and thread storage durations).

Changing the specification to refer to a type specifier simplifies all of this, and is also simpler for implementers, who can choose to interpret the keyword as one of two internally-distinct specifiers depending on the language mode, rather than always interpreting it as a storage class specifier with different effects (i.e. one grammar can contain two syntactically different auto keywords side by side, resolved at keyword-translation time). This matters to developers who also ship C++ compilers and would like to reuse as much of their internals as possible across both language frontends.

We propose that the language introduce the term “placeholder type specifier” as found in C++. A placeholder type specifier is simply a syntactic type specifier that has some additional constraints on where it may appear (i.e. not in a type-name). This is an italicized term of art.

This would change all references to auto as a storage class specifier to refer to it as a (placeholder) type specifier, and all references to "declarations that omit a type" (of which there are now none), to "declarations that contain a placeholder type specifier". We allow the placeholder type specifier to form part of any type specifier multiset, where it is simply ignored if it is not the only component of the type specifier, which subsumes usages like auto int x = 10; by making the type auto int (in which auto is ignored, and therefore the specified type is just int).
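As a hedged illustration of the last point: the declaration auto int x = 10; is already accepted at block scope by C compilers today, where auto is read as a storage-class specifier. Under this proposal the same tokens would instead form a type specifier multiset in which auto is ignored, so the declared type stays int. The sketch below only checks the resulting type with a C11 generic selection; the names IS_INT and auto_int_value are hypothetical and not part of the proposal.

```c
#include <assert.h>

/* Hypothetical helper: 1 if the expression has type int, 0 otherwise. */
#define IS_INT(e) _Generic((e), int: 1, default: 0)

/* Hypothetical function: declares a block-scope object with "auto int". */
int auto_int_value(void)
{
    auto int x = 10; /* accepted today; the proposal keeps the type as int */

    static_assert(IS_INT(x), "auto int designates plain int");
    return x;
}
```

Calling auto_int_value() returns the stored value; the interesting part is the compile-time check, which holds whichever way the keyword is classified.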

Support pointer and array declarators

US NB comment 122 requested that C explicitly require support for pointer and array declarators.

Using explicit pointer declarators is common practice in C++, and users would be surprised not to find it in C as well. C++ also supports deduction of function return types and array element types in pointer-to-function and pointer-to-array declarators (but not of function parameter types).

Note that C++ deduces the type of a braced-initializer to be a std::initializer_list, not an array. The syntax auto a[] = { ... }; is therefore not valid in C++. However, lacking this library feature, C users will expect arrays to substitute for it; and in any case there is no additional complexity in allowing this form so long as we do not bother with complicated element coercion rules, and instead require the elements to be explicitly compatible. (It would actually introduce more wording to disallow this.)

This form was explicitly requested by the US NB so it is included here. The comments from Intel were not concerned about a C++ compatibility issue here because definitions are mostly at block scope, not at file scope where they might appear in shared headers.
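The array types that users would expect from the braced-initializer form can be sketched with explicit element types. This is a minimal illustration, assuming that the proposed auto a[] = { ... } syntax is not yet accepted by existing compilers, so int is written out by hand; the proposed spelling appears only in comments. The names COUNT_OF, count_plain and count_designated are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements in an array object. */
#define COUNT_OF(a) (sizeof (a) / sizeof (a)[0])

/* Proposed spelling: auto a[] = { 1, 2 };  expected type: int[2]. */
size_t count_plain(void)
{
    int a[] = { 1, 2 };
    return COUNT_OF(a);
}

/* Proposed spelling: auto a[] = { [5] = 0 };  expected type: int[6].
   The size follows the existing rules for an array of unknown size. */
size_t count_designated(void)
{
    int a[] = { [5] = 0 };
    return COUNT_OF(a);
}
```

Because every element is required to have the same type, the only thing left to deduce in these cases is the element type itself; the element count comes from the rules that already apply to int a[] = { ... }.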

Therefore, building on the previous resolution, we propose to allow derived types to be constructed from the placeholder type, so long as the sequence of type and declarator derivations matches the same outermost sequence of derivations in the type of the initializer expression.

This trivially and elegantly allows the following forms:

auto * p1 = ...;
auto p2[] = { ... };
auto (*p3)[3] = ...;
auto (*p4)(void) = ...;

and disallows

_Atomic (auto) a = ...;
int (*f)(auto) = ...;

Note that while the atomic derived type is not allowed because it uses auto as a type-name, atomically-qualifying the declared object is perfectly OK. This is a rare difference in the behaviour of the atomic specifier and qualifier.
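The distinction between the two spellings can be seen in existing C11 code. The sketch below assumes nothing from the proposal itself: it declares one object with the _Atomic qualifier and one with the _Atomic ( type-name ) specifier, and checks that both designate the same atomic type. Only the specifier form wraps a type-name, which is where the placeholder would be forbidden. The names IS_ATOMIC_INT_PTR and atomic_forms_agree are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: 1 if p is a pointer to atomic int, 0 otherwise. */
#define IS_ATOMIC_INT_PTR(p) _Generic((p), _Atomic(int) *: 1, default: 0)

int atomic_forms_agree(void)
{
    _Atomic int  q = 0; /* qualifier form: no type-name is involved      */
    _Atomic(int) s = 0; /* specifier form: the operand is a type-name    */

    static_assert(IS_ATOMIC_INT_PTR(&q), "qualifier form is atomic int");
    static_assert(IS_ATOMIC_INT_PTR(&s), "specifier form is atomic int");
    return IS_ATOMIC_INT_PTR(&q) && IS_ATOMIC_INT_PTR(&s);
}
```

The addresses are used rather than the objects themselves because a generic selection applies lvalue conversion to its controlling expression, which would discard the atomic qualification.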

This serves to remove the implementation-defined behaviour in 6.7.10 "Type inference", and the need for associated footnote 164.
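The derivation-matching rule proposed in this section can be approximated today with a C11 generic selection, whose controlling expression undergoes lvalue, array-to-pointer and function-to-pointer conversion before its type is compared. This is only an analogy under that assumption, not the mechanism the proposal specifies; the proposed declarations appear in comments, and the names HAS_TYPE, sample_fn and count_matching_derivations are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: 1 if e, after lvalue, array-to-pointer and
   function-to-pointer conversion, has exactly the type T; 0 otherwise. */
#define HAS_TYPE(e, T) _Generic((e), T: 1, default: 0)

/* Declared only; it is never called, so no definition is needed. */
int sample_fn(int, float);

/* Hypothetical function: counts the initializers whose converted type
   carries the derivations that a proposed declarator would spell out. */
int count_matching_derivations(void)
{
    const int x = 10;
    int y[10] = { 0 };
    int *px = y;

    /* auto * p = x; would be invalid: x converts to int, not a pointer. */
    static_assert(!HAS_TYPE(x, int *), "x has no pointer derivation");
    /* auto const ** pp = &px; would be invalid: int ** is not const int **. */
    static_assert(!HAS_TYPE(&px, const int **), "qualifier mismatch");

    return HAS_TYPE(x, int)                         /* auto v = x;         */
         + HAS_TYPE(y, int *)                       /* auto * p = y;       */
         + HAS_TYPE(&y, int (*)[10])                /* auto (*p)[10] = &y; */
         + HAS_TYPE(sample_fn, int (*)(int, float)) /* auto (*p)(int, float) = sample_fn; */
         + HAS_TYPE(&px, int **);                   /* auto ** pp = &px;   */
}
```

Each term in the return expression corresponds to one declaration that the proposed rule would accept, because the outermost derivations written in the declarator also appear in the converted type of the initializer.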

Impact

As integrated into C23, the auto specifier was intended to be an exact standardization of the GNU C __auto_type specifier, renamed to look like the C++ keyword. This has the advantage of being completely backed by long-established practice.

Some compatibility changes did sneak into the implementation of the tools, which were not intended by the original proposal:

void f (int x) {
  auto s1 = (struct { int y; }){ .y = x };  // GCC error, Clang OK

  __auto_type s2 = (struct { int z; }){ .z = x }; // both OK
}

Whether to accept the first declaration is currently implementation-defined.
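The second declaration from the example above can be exercised on its own. This is a small sketch that assumes a compiler providing the GNU C __auto_type extension (the text above reports that both GCC and Clang accept this form); the function name anonymous_struct_member is hypothetical.

```c
/* Assumption: the compiler supports the GNU C __auto_type extension. */
int anonymous_struct_member(int x)
{
    /* The structure type is defined inside the initializer and has no tag,
       so the declared object's type can only be obtained by inference. */
    __auto_type s2 = (struct { int z; }){ .z = x };

    return s2.z;
}
```

For instance, anonymous_struct_member(7) yields 7, read back through the inferred structure type.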

This proposal significantly extends the feature to match more closely how it appears in C++. This is not invention, but it is also not the exact GNU C feature we initially promised to standardize. This proposal is, however, the direct result of implementation experience in Clang with the feature as it was standardized in C23.

Feedback from implementers seemed to indicate that being more generous with the syntax to allow additional C++-ish forms (i.e. auto * px = ...) is preferable to sticking to exact C dialect practice, because the C++ practice is more useful to them.

Implementation experience with the different requirements of the C and C++ features also led Clang and QAC developers to conclude that a type-specifier based implementation would be simpler than a storage-class based implementation. Although this seems like it should be a private implementation detail, for a tool like Clang this distinction is important because of the way it makes result data available, tied to the exact syntax productions of the user code.

Proposed wording

The proposed changes are based on the latest public draft of C2y, which is N3301. Bolded text is new text when inlined into an existing sentence.

Modify 6.7 “Declarations”:

Modify the first sentence of 6.7.1, paragraph 12 so that it does not imply a declaration can have no type specifier:

A declaration such that the declaration specifiers contain a placeholder type specifier (6.7.10) or that is declared with constexpr is said to be underspecified.

Add a forward reference to the new subclause in 6.7.3 "Type specifiers":

Forward references: declarators (6.7.7), enumeration specifiers (6.7.3.3), initialization (6.7.11), placeholder type specifiers (6.7.3.7), storage-class specifiers (6.7.2), type inference (6.7.10), type names (6.7.8), type qualifiers (6.7.4).

Modify 6.7.2 "Storage-class specifiers" to remove all mention of auto:

Modify 6.7.3 "Type specifiers":

Add a new entry to the list in paragraph 1:

...
atomic-type-specifier
struct-or-union-specifier
enum-specifier
typedef-name
typeof-specifier
placeholder-type-specifier

Delete the first part of the first sentence of paragraph 2, up to the comma:

At least one type specifier ...

Add a final bullet point to the multiset list in paragraph 2:

...
– enum specifier
– typedef name
– typeof specifier
– placeholder type specifier

Add a new paragraph before paragraph 3:

The placeholder type specifier may appear, at most once, as part of any other type specifier multiset in the above list.

Add a reference to type inference in paragraph 5:

Specifiers for structures, unions, enumerations, atomic types, and typeof specifiers are discussed in 6.7.3.2 through 6.7.3.6. Declarations of typedef names are discussed in 6.7.9. Type inference from the placeholder type specifier is discussed in 6.7.10. The characteristics of the other types are discussed in 6.2.5.

Delete paragraph 6.

Add a sentence to the end of paragraph 7:

Each of the comma-separated multisets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int. If the placeholder type specifier appears alongside any other specifier-qualifier-list in the list above, it is ignored, designating the same type as would be selected if it was not present.

Add a new paragraph after paragraph 7:

When the placeholder type specifier appears as the only type specifier in a specifier-qualifier-list, the designated type is the type inferred from the declaration initializer as discussed in 6.7.10 footnote).

footnote) This means it only appears in the declaration specifiers of a declaration, and not in any other context where the name of a type is used (6.7.8).

(This removes the need to point out what optional elements of a declaration appertain to, because the designated type is now concretely associated with the auto specifier in the type position.)

Add forward references to 6.7.8 and 6.7.10 at the end of the section:

Forward references: atomic type specifiers (6.7.2.4), enumeration specifiers (6.7.2.2), structure and union specifiers (6.7.2.1), tags (6.7.2.3), type names (6.7.8), type definitions (6.7.8), type inference (6.7.10).

Modify example 4 in 6.7.7.3 "Array declarators" to remove keyword-highlighting from the word "auto" in comments and change it to the regular word "automatic", as it has nothing to do with the specifier in this context.

Modify 6.7.8 "Type names":

Add a “Constraints” section before paragraph 2:

Constraints

A placeholder type specifier shall not appear as part of a type name.

(This is enough to rule out the use of auto in casts, sizeof, etc. contexts where it makes no sense. It also forbids compound literals, which we may find a use case for allowing later.)

Modify 6.7.10 "Type inference":

Syntax

placeholder-type-specifier:
auto

Constraints

A declaration for which the type is inferred shall contain a placeholder type specifier (6.7.3).

Each init-declarator in such a declaration shall have the form footnote):

declarator = initializer

footnote) in other words, the declaration is always a definition of an object, with an explicit initializer.

The placeholder type specifier shall not appear within the parameters of a function declarator.

If the initializer is a braced-initializer (6.7.11), there shall be an explicit initializer-list, and all elements of the initializer-list shall have the same type footnote).

footnote) compatible types need not be the same.

If the declaration declares more than one identifier, the same inferred type shall be inferred for each occurrence of the placeholder type specifier in the declaration.

Semantics

The placeholder type specifier designates an inferred type that is deduced from the type of the initializer for an identifier.

If the initializer is an assignment-expression, the type of the declared identifier is the type of the initializer after lvalue, array to pointer or function to pointer conversion, additionally qualified by qualifiers and amended by attributes as they appear in the declaration specifiers, if any 162).

If the initializer is a braced-initializer, the type of the declared identifier is an array type, with an element type that is the type of the initializers of the initializer list footnote). If a size is not provided in an array declarator, it is deduced according to the rules for an array of unknown size (6.7.11).

footnote) No coercion or deduction of a common or composite type is performed, as the type of all initializers in the list is the same, per the constraints.

If the type specified for the declarator is a derived type constructed from the placeholder type specifier, the inferred type is the unqualified version of the type after removing the sequence of type derivations appearing in the declarator from the type of the declared identifier. The type of the initializer shall be the type constructed by reapplying the sequence of type derivations appearing in the declarator to the inferred type.

... then continuing 6.7.10, delete the NOTE about defining structure and union types.

Replace the last sentence of paragraph 4 (in EXAMPLE 1):

The final type here is a pointer type, even though the declarator is not *p.

Add three new examples after paragraph 9 (EXAMPLE 6):

EXAMPLE 7 When a variable is declared with a derived placeholder type, the inferred type must match the derived-from type of the initializer:

auto x = 10; // no derivation from auto

auto * px1 = &x;       // valid, initializer is a pointer
auto const * px2 = &x; // valid, const is not a derivation
auto * px3 = x;        // invalid, x does not have pointer type

auto ** ppx1 = &px1;       // valid, initializer is a pointer to a pointer
auto const ** ppx2 = &px2; // valid, initializer is a pointer to a pointer to const
auto const ** ppx3 = &px1; // invalid, cannot convert to a pointer to pointer to const

int y[10] = {};

auto * py1 = &y;      // valid, py1 has type int(*)[10]
auto * py2 = y;       // valid, py2 has type int *
auto (*py3)[10] = &y; // valid, pointer to auto[10] has same derivations as pointer to int[10]

int f (int, float);
auto * pf1 = f;              // valid
auto (*pf2)(int, float) = f; // valid, declared type derives from placeholder return type
int (*pf3)(auto, auto) = f;  // invalid, cannot derive from placeholder parameter type

EXAMPLE 8 When a variable is initialized with a braced-initializer, it has an array type inferred from the initializer elements:

auto a1 = { 1, 2, 3 }; // type is int[3]
auto a2[] = { 1, 2 };  // type is int[2]
auto a3 = { 1 };       // type is int[1]
auto a4[] = {};        // invalid, the initializer list is empty

auto a5[] = { [5] = 0 };  // type is int[6]
auto a6[3] = { [5] = 0 }; // invalid, [5] is not within the array
auto a7 = { 1u, 2, 3.0 }; // invalid, initializer types are not the same

EXAMPLE 9 When a declaration contains multiple identifiers to declare, the inferred type for each declarator needs to be the same, but the final object type can be different.

auto x = 10, y = 20;  // valid, both the same
auto w = 10, *z = &x; // valid, both infer int
                      // even though object types are different

auto a = 5, b = 6.0;  // invalid, infer different types
auto *c = &x, d = z;  // invalid, inferring different types (int and int*)
                      // even though object types are the same

Add a forward reference:

Forward references: initialization (6.7.11).

Modify 6.9 "External definitions", second sentence of paragraph 2:

A placeholder type specifier shall only appear in the declaration specifiers in an external declaration if the declaration is a definition and the type is inferred from the initializer.

Remove implementation-defined behaviour J.3.13 (2).

Remove J.5.12 "Type inference".

(As currently specified in the text provided above, all behaviours should be well-defined. We might choose to make some elements of this specification undefined again to allow for extensions like int (*f) (auto, auto) = ..., but there is no precedent for this.)

Questions for WG14

Would WG14 like to explicitly allow declarators with forms other than plain identifiers to be declared using type inference, in C2y?

Would WG14 like to change the definition of auto from a storage-class specifier to a placeholder-type specifier, in C2y?

Would WG14 like to remove the implementation-defined behaviour associated with underspecified declarations introducing more than one identifier, making this well-defined in C2y?

References

C2y public draft (N3301)
C++23 public draft (N4950)
C23 CD ballot comments
N3007 Type inference for object definitions