Proposal for C2y
WG14 3267

Title:               `if` declarations, v2
Author, affiliation: Alex Celeste, Perforce
Date:                2023-11-30
Proposal category:   New feature
Target audience:     Compiler implementers, users

Abstract

The C if statement only admits an expression as its operand. In contrast, the for statement admits either an expression or a declaration as its first operand. We propose that C modifies the if statement to allow the operand to be either a declaration or an expression, and to optionally allow a second expression clause when the first clause is a declaration. This is taken from existing practice in C++.


if declarations, v2

Reply-to:     Alex Celeste (aceleste@perforce.com)
Document No:  N3267
Revises:      N3196
Date:         2024-05-30

Summary of Changes

N3267

N3196

Introduction

C99 introduced the ability to treat clause-1 of a for statement as either an expression or a declaration (as part of the wider change to allow mixing declarations with code). The benefits of this are by now well-accepted: for a variable whose sole job is to act as an iterator for the loop, it makes the most sense for that variable to have its scope be bound as tightly as possible to the loop. This is so widely accepted that we will not justify it further.

In C++, if has allowed the condition to be either an expression or a declaration with initializer since C++98. C++17 enhanced this by allowing a second clause in if (and also switch), so that the declaration can give a name to a temporary which the condition itself then uses in an expression. Taken together, the two clauses function much like the first two clauses of for in C, without the looping behaviour: the declaration or first expression "does" nothing in terms of control except exist within scope, and the second clause provides the controlling expression.

We propose that C should adopt the C++17 enhancement and allow declarations directly within the condition of if and switch. This provides tighter scoping of temporaries, and has downstream effects allowing for substantially simpler definitions of "library control structures" via macro that previously had to use for and/or anaphoric constructions.

Most of the motivation is thoroughly described by p0305.

Control macros

An example of a control macro which would be simpler to implement with this feature is just (and expect), a pseudo-monadic way to safely access the content of a library Optional type. A just macro admits an Optional as its operand and, if it has a value, declares a variable with that value (or perhaps a native pointer to it, depending on the implementation) that is visible in the operand-scope-block.

In C23, the most straightforward way ends up looking like:

Optional(int) ox = ...;

just (ox) {
  useValue (it);
} else {
  useNil ();
}

The declaration of it is implicit (anaphoric) for the operand scope, because it is impractical to write a macro that would allow the syntax we really want:

if (int * x = just (ox)) {
  ...
}

(In this case the entire control macro is a workaround, although there are also control macros that would remain useful abstractions built on top of if (;). For instance, we could rename the if keyword anyway to show intent.)

Further discussion of use cases like this is also covered by p0305.

Alternatives

Alternatives are described by p0305.

None of the suggestions there really help with library control structures, unless the library author gives up on integration completely and starts defining new controls with a FORM ... ENDFORM paired-keyword syntax, which integrates poorly with C in both readability and, more importantly, composability.

Prior Art

The feature was standardized in C++17 and is now widely used.

Compatibility

The feature standardized here differs slightly from the C++ feature by not including the C++ grammatical construct condition, which allows the second clause of if and for to be a second, completely separate declaration:

if (int x = 0; int y = x + 1) {
  y;
}

Instead of over-complicating the grammar in a single change, and confusing the change to if with largely-unrelated changes to for, we separate this aspect out and will consider aligning with this for both statement kinds in a subsequent proposal instead. Therefore, the second clause to if is limited to being an expression only in this proposal.

The second-declaration form was an emergent feature of the C++ grammar, and users are unlikely to try to use it intentionally.

Impact

No existing code is affected by this change.

Having implemented this feature in our C++ compiler, and having had similar experiences in the past when enhancing other control structures as they were upgraded (such as the range-for in C++11), we do not expect adding this feature to have any substantial development impact on a mature C compiler.

Our C++ compiler was able to add new classes representing different control structures and transparently see them "just work" with existing queries.

In our C compiler, which uses a different architecture, the representation of every control structure is homogeneous anyway. Since a control structure could already have both a declaration and a controlling expression, this feature fell out completely naturally and could be added with minimal effort (essentially only needing to add the implicit controlling expression for the C++98 syntax).

We expect other tools to have very similar experiences and therefore consider this proposal to have minimal development impact. Any "small" tool should have little trouble integrating this change.

Further discussion of impact is also covered by p0305 and is largely the same for C as it was for C++.

Proposed wording

The proposed changes are based on the latest public draft of C2y, which is N3220. Bolded text is new text when inlined into an existing sentence.

This wording borrows directly from p0305.

Top-level grammar

Add a new rule to the end of the top-level statement grammar in 6.8 "Statements and blocks", 6.8.1 "General", Syntax, paragraph 1:

init-statement:
  expression ;
  declaration ;
  ;

Selection

Modify the statement grammar in 6.8.5 "Selection statements", Syntax, paragraph 1:

selection-statement:
  if ( selection-header ) secondary-block
  if ( selection-header ) secondary-block else secondary-block
  switch ( selection-header ) secondary-block

selection-header:
  expression
  declaration expression
  attribute-specifier-sequence_opt declarator = initializer

(This makes failure to initialize the object in the third form a syntax error, and intentionally limits it to declaring a single object.)

Add two new paragraphs to 6.8.5.2 "The if statement", "Semantics", before paragraph 2:

If the selection-header is the first or second form, the controlling expression is the expression of the selection-header. Otherwise, there is no explicit expression and the controlling expression is the value of the single declared object.

If the selection-header includes a declaration (the second or third forms), the scope of any identifiers it declares is the remainder of the declaration and the entire selection-statement, including the subsequent expression in the second form as well as the first secondary-block, and if present, the secondary-block following else. In the second form, the declaration is reached in the order of execution before evaluation of the expression.

Modify paragraph 2:

In both forms, the first substatement is executed if the expression compares unequal to 0, or if the selection-header has no expression and the single declared object is initialized with a value unequal to 0. In the else form, the second substatement is executed if the expression compares equal to 0, or if the selection-header has no expression and the single declared object is initialized with a value equal to 0. If the first substatement is reached via a label, the second substatement is not executed.

Add an example after paragraph 3:

EXAMPLE The controlling expression of an if statement that uses a declaration as its selection-header is implicitly the value of the declaration:

if (int x = get ()) {
  // x is non-0 here
} else {
  // x is 0 here
}

is equivalent to

if (int x = get (); x) {
  // x is non-0 here
} else {
  // x is 0 here
}

and therefore to

{
  int x = get ();
  if (x) {
    // x is non-0 here
  } else {
    // x is 0 here
  }
}

Add a new paragraph to 6.8.5.3 "The switch statement", "Semantics", before paragraph 4:

If the selection-header is the first or second form, the controlling expression is the expression of the selection-header. Otherwise, there is no explicit expression and the controlling expression is the value of the single declared object.

(The example for switch would essentially just be repetition.)

Iteration

The proposed change to 6.8.6 "Iteration statements" has been removed from this version of the feature proposal and will be revisited at a later time.

Questions for WG14

Does WG14 want to add declarations in selection statements to C using the proposed syntax and wording?

Would WG14 like to see a subsequent paper that unifies the wording with C++ to allow a second declaration in the second clause of if, switch and for?

References

N3220 C2y public draft
N740 Declarations in for
p0305r1 Selection statements with initializer
Anaphoric macros
Example of just and expect
C++17