Proposal for C2y
WG14 3356
Title:               `if` declarations, v3
Author, affiliation: Alex Celeste, Perforce
Date:                2024-09-18
Proposal category:   New feature
Target audience:     Compiler implementers, users
The C if statement only admits an expression as its operand. In contrast,
the for statement admits either an expression or a declaration as its
first operand. We propose that C modifies the if statement to allow the
operand to be either a declaration or an expression, and to optionally allow
a second expression clause when the first clause is a declaration. This is
taken from existing practice in C++.
Reply-to:     Alex Celeste (aceleste@perforce.com)
Document No:  N3356
Revises:      N3267
- rebase against N3301
- fixes to proposed grammar based on feedback
- move more semantics to the toplevel of 6.8.5
- rebase wording on C2y draft
- rework the grammar to use original, simpler rules instead of copying C++
- scope of the identifier persists into the else
- remove changes to for
- original proposal
C99 introduced the ability to treat clause-1 of a for statement as either
an expression or a declaration (as part of the wider change to allow mixing
declarations with code). The benefits of this are by now well-accepted: for a
variable whose sole job is to act as an iterator for the loop, it makes the
most sense for that variable to have its scope be bound as tightly as possible
to the loop. This is so widely accepted that we will not justify it further.
In C++, if has allowed the condition to be either an expression or a
declaration with initializer since C++98. In C++17, this was enhanced to allow
for a second clause in if (and also switch), so that the declaration could
put a name to a temporary which the condition itself could then use in an
expression, functioning altogether much like the first two clauses of for
in C (where the declaration/first expression "does" nothing in terms of control
except exist within scope; the second clause provides the control expression),
without the looping behaviour.
We propose that C should adopt the C++17 enhancement and allow declarations
directly within the condition of if and switch. This provides tighter
scoping of temporaries, and has downstream effects allowing for substantially
simpler definitions of "library control structures" via macro that previously
had to use for and/or anaphoric constructions.
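To make the scoping benefit concrete, the following sketch shows the extra block that today's C needs in order to confine a temporary to a single if. The names parse_digit and classify are hypothetical and exist only for illustration; the comment near the end shows how the same statement might be spelled with the syntax proposed here.

```c
#include <assert.h>

/* Hypothetical helper: digit value of c, or -1 if c is not a decimal digit. */
static int parse_digit(char c) {
  return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* Returns twice the digit value of c, or -1 when c is not a digit. */
static int classify(char c) {
  int result = -1;
  {                          /* extra block needed only to scope d */
    int d = parse_digit(c);
    if (d >= 0) {
      result = d * 2;
    }
  }                          /* d is out of scope from here on */
  /* Proposed spelling of the block above:
       if (int d = parse_digit(c); d >= 0) { result = d * 2; }  */
  return result;
}
```

The behaviour is the same in both spellings; only the amount of surrounding syntax differs.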
Most of the motivation is thoroughly described by p0305.
An example of a control macro which would be simpler to implement with this
feature is just (and expect), a pseudo-monadic way to safely access the
content of a library Optional type. A just macro admits an Optional as
its operand and, if it has a value, declares a variable with that value
(or perhaps a native pointer to it, depending on the implementation) that is
visible in the operand-scope-block.
In C23, the most straightforward way ends up looking like:
Optional(int) ox = ...
just (ox) {
  useValue (it);
} else {
  useNil ();
}
The declaration of it is implicit (anaphoric) for the operand scope, because
it is impractical to write a macro that would allow the syntax we really want:
if (int * x = just (ox)) {
  ...
}
(In this case the entire control macro is a workaround, although there are also
examples of control macros that would still be useful abstractions built on
top of if (;); for instance, we could rename the if keyword anyway to show
intent.)
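As a rough illustration of the anaphoric workaround, the sketch below builds a just macro on top of for. It is one possible construction, not the library implementation referenced in this paper: OptInt is a hypothetical stand-in for Optional(int), and value_or and just_once_ are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for Optional(int). */
typedef struct OptInt {
  bool has_value;
  int value;
} OptInt;

/* Anaphoric control macro: the for clause declares `it` for the statement
   that follows, and the trailing if lets a user-written else attach to it.
   Caveats: `opt` is evaluated more than once, and a break inside the body
   leaves this hidden loop rather than any enclosing one. */
#define just(opt)                                                    \
  for (int just_once_ = 1, it = (opt).has_value ? (opt).value : 0;   \
       just_once_; just_once_ = 0)                                   \
    if ((opt).has_value)

/* Returns the contained value, or fallback when ox is empty. */
static int value_or(OptInt ox, int fallback) {
  int result = 0;
  just (ox) {
    result = it;         /* `it` was declared implicitly by the macro */
  } else {
    result = fallback;
  }
  return result;
}
```

The hidden loop and the fixed name it are exactly the kind of machinery that an if with a declaration would make unnecessary.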
Further discussion of use cases like this is also covered by p0305.
Alternatives are described by p0305.
None of the suggestions there really help with library control structures,
unless the library author gives up on integration completely and starts
defining new controls with a FORM ... ENDFORM paired-keyword syntax, which
does not integrate well with C at all either in readability, or, more
importantly, composability.
The feature was standardized in C++17 and is now widely used.
The feature standardized here differs slightly from the C++ feature by
not including the C++ grammatical construct condition, which allows
the second clause of if and for to be a second, completely separate
declaration:
if (int x = 0; int y = x + 1) {
  y;
}
Instead of over-complicating the grammar in a single change, and confusing
the change to if with largely-unrelated changes to for, we separate
this aspect out and will consider aligning with this for both statement kinds
in a subsequent proposal instead. Therefore, the second clause to if is
limited to being an expression only in this proposal.
This was an emergent feature that users are unlikely to try to use intentionally.
No existing code is affected by this change.
Having implemented this feature in our C++ compiler, and having had similar experiences in the past when enhancing other control structures as they were upgraded (such as the range-for in C++11), we do not expect adding this feature to have any substantial development impact on a mature C compiler.
Our C++ compiler was able to add new classes representing different control structures and transparently see them "just work" with existing queries.
In our C compiler, which uses a different architecture, the representation of any control structure is homogeneous anyway. Since a control structure was already able to have a declaration and a controlling expression, this feature fell out completely naturally and could be added with minimal effort (essentially only needing to add the implicit controlling expression for the C++98 syntax).
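The idea of a homogeneous representation can be sketched as follows. This is purely illustrative and does not describe our compiler or any other real tool: the node types Expr, Decl and Selection and the function controlling_text are all hypothetical. The point is only that a node which can already hold an optional declaration and an optional expression needs one extra rule to supply the implicit controlling expression.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Entirely hypothetical node types, invented for this sketch. */
typedef struct Expr { const char *text; } Expr;
typedef struct Decl { const char *name; const Expr *init; } Decl;

typedef struct Selection {
  const Decl *decl;   /* NULL when the header is a plain expression */
  const Expr *cond;   /* NULL when the header is a declaration only */
} Selection;

/* Returns the source text acting as the controlling expression:
   the explicit expression if present, otherwise the declared name. */
static const char *controlling_text(const Selection *s) {
  if (s->cond != NULL) {
    return s->cond->text;
  }
  return s->decl != NULL ? s->decl->name : NULL;
}
```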
We expect other tools to have very similar experiences and therefore consider this proposal to have minimal development impact. Any "small" tool should have little trouble integrating this change.
Further discussion of impact is also covered by p0305 and is largely the same for C as it was for C++.
The proposed changes are based on the latest public draft of C2y, which is N3301. Bolded text is new text when inlined into an existing sentence.
(This wording has been altered and is intentionally no longer the same as p0305 because the underlying grammar is not similar enough for the C++ changes to simply "drop in" to C. The intended semantic effect of the changes is still identical.)
Add a new rule to the declaration grammar in 6.7.1 "Declarations", paragraph 1, at the bottom of the current list of grammar rules:
simple-declaration:
attribute-specifier-sequence_opt declaration-specifiers declarator = initializer
Add a new paragraph after paragraph 12:
A simple-declaration is a declaration footnote) that can appear in place of the controlling expression of a selection statement.
footnote) subject to the same constraints as all other declarations with initializers.
(NOTE: by placing this here, it makes clear that this construct is a declaration, and avoids the need to say "subject to the same constraints as ..." in other parts of the document.)
Add a forward reference:
Forward references: declarators (6.7.7), enumeration specifiers (6.7.3.3), initialization (6.7.11), storage-class specifiers (6.7.2), type inference (6.7.10), type names (6.7.8), type qualifiers (6.7.4), selection statements (6.8.5).
Modify the statement grammar in 6.8.5 "Selection statements", Syntax, paragraph 1:
selection-statement:
if ( selection-header ) secondary-block
if ( selection-header ) secondary-block else secondary-block
switch ( selection-header ) secondary-block
selection-header:
expression
declaration expression
simple-declaration
(This makes failure to initialize the object in the third form a syntax error, and intentionally limits it to declaring a single object.)
Add two new paragraphs to "Semantics", after paragraph 2:
If the selection-header is the first or second form, the controlling expression is the expression of the selection-header. Otherwise, there is no explicit expression and the controlling expression is the value of the single declared object after initialization.
If the selection-header includes a declaration (the second or third forms), the scope of any identifiers it declares is the remainder of the declaration and the entire selection-statement, including the subsequent expression in the second form as well as the secondary-block or blocks. In the second form, the declaration is reached in the order of execution before evaluation of the expression.
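(NOTE: the following sketch is illustrative only and is not part of the proposed wording. It writes the second form as the block-equivalent that compiles today, to show the identifier being used in the controlling expression and in both branches. The names lookup and describe are hypothetical.)

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup: returns 0 on success, a positive error code otherwise. */
static int lookup(const char *key) {
  return strcmp(key, "known") == 0 ? 0 : 2;
}

/* Proposed spelling (second form of selection-header):
     if (int rc = lookup(key); rc == 0) { ... } else { ... uses rc ... }  */
static int describe(const char *key) {
  int out;
  {
    int rc = lookup(key);   /* the declaration is reached first */
    if (rc == 0) {          /* then the expression is evaluated */
      out = 100;
    } else {
      out = -rc;            /* rc is still in scope in the else branch */
    }
  }                         /* rc is no longer visible here */
  return out;
}
```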
Modify 6.8.5.2 "The if statement":
Modify "Semantics", existing paragraph 2:
In both forms, the first substatement is executed if the controlling expression footnote) compares unequal to 0. In the else form, the second substatement is executed if the controlling expression compares equal to 0. If the first substatement is reached via a label, the second substatement is not executed.
footnote) which is implicitly the initialized value of the declared object, if the selection-header only has a declaration. The implementation is permitted to re-read the object to determine the value, but is not required to do so, even when the object has volatile-qualified type.
Add two examples:
EXAMPLE 1 The controlling expression of an if statement that uses a declaration as its selection-header is implicitly the value of the declaration:
if (int x = get ()) {
  // x is non-0 here
} else {
  // x is 0 here
}
is equivalent to
if (int x = get (); x) {
  // x is non-0 here
} else {
  // x is 0 here
}
and therefore to
{
  int x = get ();
  if (x) {
    // x is non-0 here
  } else {
    // x is 0 here
  }
}
EXAMPLE 2 The controlling expression of any if statement is always implicitly compared to 0 by the statement itself:
double x = DBL_SNAN;
if (x) {
  // fetestexcept (FE_INVALID) is nonzero because of the comparison
}
Modify 6.8.5.3 "The switch statement":
Modify paragraph 2 of "Constraints":
If a switch statement has an associated case or default label within the scope of an identifier with a variably modified type, the entire secondary-block of the switch statement shall be within the scope of that identifier.183)
and modify footnote 183 accordingly:
183) That is, the declaration either precedes the switch statement, or it appears in the selection-header, or it follows the last case or default label associated with the switch that is in the block containing the declaration.
(Adding an example for switch would essentially just be repetition.)
The proposed change to 6.8.6 "Iteration statements" has been removed from this version of the feature proposal and will be revisited at a later time.
Does WG14 want to add declaration-in-selections to C2y using the proposed syntax and wording?
Would WG14 like to see a subsequent paper that unifies the wording with C++
to allow a second declaration in the second clause of if, switch and for?
Huge thanks to Joseph Myers for detailed and thorough review of previous versions.
C2y public draft
N740 Declarations in for
p0305r1 Selection statements with initializer
Anaphoric macros
Example of just and expect
C++17