_Generic primary expression
This issue has been automatically converted from the original issue lists and some formatting may not have been preserved.
Authors: WG14, Jens Gustedt
Date: 2015-04-24
Reference document: N1930
Submitted against: C11 / C17
Status: Fixed
Fixed in: C17
Cross-references: 1001
Converted from: n2396.htm
This is a follow-up to the now closed DR 423, which resulted in the clarification of the status of qualifications of rvalues.
This defect report aims to clarify the status of the controlling expression of
_Generic primary expression:
Does the controlling expression of a _Generic primary expression undergo
any type of conversion to calculate the type that is used to do the
selection?
Implementers have given different answers to this question; gcc (choice 1 in
the following) on one side and clang and IBM (choice 2) on the other side went
quite opposite ways, resulting in severe incompatibility for _Generic
expressions that use qualifiers or arrays.
char const* a = _Generic("bla", char*: "blu");                 // clang error
char const* b = _Generic("bla", char[4]: "blu");               // gcc error
char const* c = _Generic((int const){ 0 }, int: "blu");        // clang error
char const* d = _Generic((int const){ 0 }, int const: "blu");  // gcc error
char const* e = _Generic(+(int const){ 0 }, int: "blu");       // both ok
char const* f = _Generic(+(int const){ 0 }, int const: "blu"); // both error
The last two lines, where gcc and clang agree, point to the nature of the problem: gcc treats all such expressions as rvalues and does all applicable conversions of 6.3.2.1, that is lvalue to rvalue and array to pointer conversions. clang treats them as lvalues.
The question is whether or not the conversions of 6.3 apply to the controlling expression.
_Generic is not an operator, but a primary expression. The wording in 6.5.1.1 is "has a type" and doesn't make any reference to type conversion.
... _Generic either, which are listed in 6.5.1.1.
Applying promotions would have the effect that we wouldn't be able to
distinguish narrow integer types from int. There is no indication that the
text implies that form of conversion, nor that anybody has proposed to use
_Generic like this.
All conversions in 6.3.2.1 p2 describe what in normal CS language would be
named the evaluation of an object. That paragraph has no provision for applying
them to types alone. In particular, it includes the special clause that
uninitialized register variables lead to undefined behavior when undergoing
lvalue conversion. As a consequence:
Any lvalue conversion of an uninitialized register variable leads to
undefined behavior.
And thus
Under the hypothesis that the controlling expression undergoes lvalue
conversion, any _Generic primary expression that uses an uninitialized
register variable as controlling expression leads to undefined behavior.
In view of the resolution of DR 423 (rvalues drop qualifiers), using _Generic
primary expressions with objects in the controlling expression may have results
that appear surprising.
#define F(X) _Generic((X), char const: 0, char: 1, int: 2)
char const strc[] = "";
F(strc[0])   // -> 0
F(""[0])     // -> 1
F(+strc[0])  // -> 2
So the problem here is that there is no type-agnostic operator that results in
a simple lvalue conversion of char const objects to char; all such
operators also promote char to int.
Under the hypothesis that the controlling expression doesn't undergo
conversion, any _Generic primary expression that uses a qualified lvalue of
narrow type T can't directly trigger the association for T itself.
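The promotion problem can be checked with a small sketch. KIND and check_promotion are hypothetical names introduced only for this illustration; the two assertions use only cases on which both readings agree.

```c
#include <assert.h>

/* Hypothetical classifier: 1 = char, 2 = int, 0 = anything else. */
#define KIND(X) _Generic((X), char: 1, int: 2, default: 0)

static char const strc[] = "";

void check_promotion(void) {
    /* Unary plus performs lvalue conversion, but it also promotes:
       the qualifier is dropped and char becomes int. */
    assert(KIND(+strc[0]) == 2);
    /* A string literal is an array of unqualified char, so its
       elements select the char association under either reading. */
    assert(KIND(""[0]) == 1);
    /* KIND(strc[0]) is deliberately not asserted: it is 1 if the
       controlling expression undergoes lvalue conversion (choice 1)
       and 0 if it does not (choice 2). */
}
```

Only the unasserted case distinguishes the two readings, which is why no operator-based rewrite can recover the char association for a char const lvalue.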
For many areas the two approaches are feature equivalent, that is both allow to implement the same semantic concepts, but with different syntax. Rewriting code that was written with one of the choices in mind to the other choice is in general not straightforward and probably can't be automated.
Code that was written with choice 1 in mind (enforced lvalue and array conversion) when translated to choice 2 has to enforce such conversions. E.g., as long as we know that the type of X is only a wide integer type or an array or pointer type, a macro such as
        #define bla(X) _Generic((X), ... something ... )
would have to become
        #define bla(X) _Generic((X)+0, ... something ... )
Writing code that takes care of narrow integer types is a bit more difficult, but can be done with 48 extra case selections, taking care of all narrow types (6) and all their possible qualifications (8; restrict is not possible here). Code that uses struct or union types must use bizarre things like 1 ? (X) : (X) to enforce lvalue conversion.
        #define blaOther(X) _Generic((X),                      \
          char: blub, char const: blub, ...,                   \
          short: ...,                                          \
          default: _Generic(1 ? (X) : (X), struct toto: ... ))
        #define bla(X) _Generic((X)+0, ... something ... ,     \
          default: blaOther(X))
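A minimal sketch of the conditional-operator trick for aggregate types follows. IS_TOTO and check_struct_conversion are hypothetical names; struct toto is given a single illustrative member. The conditional operator converts its operands, so the controlling expression has the unqualified struct type even on an implementation that applies no conversion of its own.

```c
#include <assert.h>

struct toto { int value; };   /* illustrative member, not from the report */

/* Hypothetical macro: 1 if X is a struct toto once qualifiers are dropped. */
#define IS_TOTO(X) _Generic(1 ? (X) : (X), struct toto: 1, default: 0)

void check_struct_conversion(void) {
    struct toto const fixed = { 42 };
    struct toto plain = { 7 };
    /* The const qualifier of fixed does not prevent the match. */
    assert(IS_TOTO(fixed) == 1);
    assert(IS_TOTO(plain) == 1);
    /* A member of type int falls through to the default association. */
    assert(IS_TOTO(fixed.value) == 0);
}
```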
Code that was written with choice 2 in mind (no lvalue or array conversion) when translated to choice 1 has to move to a setting where qualifiers and arrays are preserved in the type. The only such setting is the address-of operator &.
        #define blu(X) _Generic((X), \
           char const: blub,         \
           char[4]: blob,            \
           ...)
has to be changed to something like
        #define blu(X) _Generic(&(X),\
          char const*: blub,         \
          char(*)[4]: blob,          \
          ...)
That is, each individual type selection has to be transformed, and the syntactical change that is to be applied is no simple textual replacement.
Since today C implementations have already taken different paths for this
feature, applications should be careful when using _Generic to remain in the
intersection of these two interpretations. A certain number of design questions
should be answered when implementing a type generic macro:
... struct types?
The following lists different strategies for common scenarios that can be used to code type generic macros that will work with either of the choices 1 or 2.
This is, e.g., the case of the C library interfaces in <tgmath.h>. If we know
that the possible type of the argument is restricted in such a way, the easiest
approach is to apply the unary plus operator +, as in
  #define F(X) _Generic(+(X),             \
    default: doubleFunc,                  \
    int: intFunc,                         \
    ...                                   \
    _Complex long double: cldoubleFunc)(X)
  #define fabs(X) _Generic(+(X),          \
    default: fabs,                        \
    float: fabsf,                         \
    long double: fabsl)(X)
This + sign ensures an lvalue to rvalue conversion and that it will error
out at compilation time for pointer types or arrays. It also forcibly promotes
narrow integer types, usually to int. For the latter case of fabs, all integer
types will map to the double version of the function, and the argument will
eventually be converted to double before the call is made.
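The dispatch pattern can be exercised without the math library by using stand-in functions. Everything named here (viaFloat, viaDouble, viaLongDouble, WIDTH, check_unary_plus) is hypothetical and only mirrors the shape of the fabs macro above.

```c
#include <assert.h>

enum width { AS_FLOAT, AS_DOUBLE, AS_LONG_DOUBLE };

/* Hypothetical stand-ins for fabsf, fabs and fabsl: each one only
   reports which overload was selected. */
enum width viaFloat(float x)            { (void)x; return AS_FLOAT; }
enum width viaDouble(double x)          { (void)x; return AS_DOUBLE; }
enum width viaLongDouble(long double x) { (void)x; return AS_LONG_DOUBLE; }

#define WIDTH(X) _Generic(+(X),           \
    default: viaDouble,                   \
    float: viaFloat,                      \
    long double: viaLongDouble)(X)

void check_unary_plus(void) {
    float const f = 1.5f;
    short s = -3;
    long double ld = 2.0L;
    assert(WIDTH(f) == AS_FLOAT);         /* + drops the const qualifier */
    assert(WIDTH(s) == AS_DOUBLE);        /* short promotes to int -> default */
    assert(WIDTH(2.5) == AS_DOUBLE);
    assert(WIDTH(ld) == AS_LONG_DOUBLE);
}
```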
If we also want to capture pointer types and convert arrays to pointers, we
should use +0.
  #define F(X) _Generic((X)+0,            \
    default: doubleFunc,                  \
    char*: stringFunc,                    \
    char const*: stringFunc,              \
    int: intFunc,                         \
    ...                                   \
    _Complex long double: cldoubleFunc)(X)
This binary + ensures that any array is first converted to a pointer; the
properties of 0 ensure that this constant works well with all the types that
are to be captured, here. It also forcibly promotes narrow integer types,
usually to int.
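As a sketch of the +0 approach, the following classifier returns a tag instead of calling a function. KIND0, enum kind and check_plus_zero are hypothetical names chosen for this illustration.

```c
#include <assert.h>

enum kind { K_OTHER, K_INT, K_LONG, K_DOUBLE, K_STRING };

/* Hypothetical classifier built on the binary + trick. */
#define KIND0(X) _Generic((X)+0,  \
    default: K_OTHER,             \
    char*: K_STRING,              \
    char const*: K_STRING,        \
    int: K_INT,                   \
    long: K_LONG,                 \
    double: K_DOUBLE)

void check_plus_zero(void) {
    char buffer[16] = "abc";
    char const fixed[] = "xyz";
    long const big = 1L;
    signed char tiny = 1;
    assert(KIND0(buffer) == K_STRING);  /* char[16] converts to char* */
    assert(KIND0(fixed) == K_STRING);   /* char const[4] converts to char const* */
    assert(KIND0(big) == K_LONG);       /* qualifier dropped, long + int is long */
    assert(KIND0(tiny) == K_INT);       /* narrow type promoted to int */
    assert(KIND0(1.0) == K_DOUBLE);
}
```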
If we know that a macro will only be used for array and pointer types, we can
use the [] operator:
  #define F(X) _Generic(&((X)[0]),        \
    char*: stringFunc,                    \
    char const*: stringFunc,              \
    wchar_t*: wcsFunc,                    \
    ...                                   \
    )(X)
This operator only applies to array or pointer types and would produce an error if presented with any integer type.
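A small sketch of the subscript approach, with hypothetical names TEXT_KIND and check_subscript. It assumes, as on common platforms, that wchar_t is not compatible with char, so that the associations do not clash.

```c
#include <assert.h>
#include <stddef.h>   /* wchar_t */

enum text { NARROW, WIDE };

/* Hypothetical classifier: &((X)[0]) yields a pointer to the element
   type for arrays and pointers alike. */
#define TEXT_KIND(X) _Generic(&((X)[0]),  \
    char*: NARROW,                        \
    char const*: NARROW,                  \
    wchar_t*: WIDE,                       \
    wchar_t const*: WIDE)

void check_subscript(void) {
    char buffer[8] = "abc";
    char const *view = "xyz";
    wchar_t wide[4] = L"abc";
    assert(TEXT_KIND(buffer) == NARROW);  /* array of char */
    assert(TEXT_KIND(view) == NARROW);    /* pointer to char const */
    assert(TEXT_KIND(wide) == WIDE);      /* array of wchar_t */
    assert(TEXT_KIND(L"lit") == WIDE);    /* wide string literal */
}
```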
If we want a macro that selects differently according to type qualification or
according to different array size, we can use the & operator:
  #define F(X) _Generic(&(X),        \
    char**: stringFunc,              \
    char(*)[4]: string4Func,         \
    char const**: stringFunc,        \
    char const(*)[4]: string4Func,   \
    wchar_t**: wcsFunc,              \
    ...                              \
    )(X)
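The address-of approach can be sketched as follows; SHAPE, enum shape and check_address_of are hypothetical names. Since & needs an lvalue operand, a macro of this kind cannot be applied to rvalues or to objects declared with register.

```c
#include <assert.h>

enum shape { PTR_MUTABLE, PTR_CONST, ARRAY4_MUTABLE, ARRAY4_CONST, UNKNOWN };

/* Hypothetical classifier: taking the address keeps qualifiers and
   array sizes visible in the pointed-to type. */
#define SHAPE(X) _Generic(&(X),            \
    char**: PTR_MUTABLE,                   \
    char const**: PTR_CONST,               \
    char(*)[4]: ARRAY4_MUTABLE,            \
    char const(*)[4]: ARRAY4_CONST,        \
    default: UNKNOWN)

void check_address_of(void) {
    char *p = 0;
    char const *q = 0;
    char a[4] = "abc";
    char const b[4] = "xyz";
    char longer[8] = "abcdefg";
    assert(SHAPE(p) == PTR_MUTABLE);
    assert(SHAPE(q) == PTR_CONST);
    assert(SHAPE(a) == ARRAY4_MUTABLE);
    assert(SHAPE(b) == ARRAY4_CONST);
    assert(SHAPE("bla") == ARRAY4_MUTABLE);  /* "bla" has type char[4] */
    assert(SHAPE(longer) == UNKNOWN);        /* char[8] is not listed */
}
```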
The above discussion describes what can be read from the text of C11 alone, and not the intent of the committee. I think that if the committee had wanted choice 2, the standard text would not have looked much different from what we have now. Since the intent of the committee to go for choice 1 also does not seem to be very clear from any additional text (e.g. minutes of the meetings), I think the reading of choice 2 should be the preferred one.
Amend the list in footnote 121 for objects with register storage class. Change
Thus, the only operators that can be applied to an array declared with storage-class specifier
register are sizeof and _Alignof.
to
Thus, an identifier with array type and declared with storage-class
specifier register may only appear in primary expressions and as operand to
sizeof and _Alignof.
Change 6.5.1.1 p3, first sentence
The controlling expression of a generic selection is not evaluated and the type of that expression is used without applying any conversions described in Section 6.3.
Add _Generic to the exception list in 6.3.2.1 p3 to make it clear that array
to pointer conversion applies to none of the controlling or association
expressions if they are lvalues of array type.
Except when it is the controlling expression or an association expression of a
_Generic primary expression, or is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type “array of type” is converted to an expression with type “pointer to type” that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
Also add a forward reference to _Generic in 6.3.2.
If the intent of the committee had been choice 1 or similar, bigger changes of the standard would be indicated. I only list some of the areas that would need changes:
... _Generic from primary expressions to a proper subsection, and rename the feature to _Generic operator.
Also, add _Generic to the exception list in 6.3.2.1 p3 to make it clear that
array to pointer conversion applies to none of the association expressions if
they are lvalues of array type.
Except when it is an association expression of a
_Generic expression, or is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type “array of type” is converted to an expression with type “pointer to type” that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
A third possibility would be to leave this leeway to implementations. I strongly object to that, but if so, I would suggest adding a phrase to 6.5.1.1 p3 like:
... in the default generic association. Whether or not the type of the controlling expression is determined as if any of the conversions described in Section 6.3 are applied is implementation-defined. None of the expressions ...
Comment from WG14 on 2017-11-03:
Oct 2015 meeting
... _Generic proposal that the intent was that selecting on qualified types was explicitly to be avoided, as was selecting on arrays by size. The intent of _Generic was to give C a mechanism to somewhat express the notion of “overloaded function” found in C++, and in particular a possible mechanism for implementors to use to implement the atomic type generic functions from section 7.17.7. Although this sentiment is most closely reflected in Choice 1 above, and it is reported that clang has also now adopted that approach, the committee feels that the wording in the Suggested Technical Corrigendum is not appropriate.
... _Generic primary expression.
The type of the controlling expression of a generic selection is the unqualified type determined by applying the lvalue conversions described in 6.3.2.1p2 as if by evaluation.
Apr 2016 meeting
The paper N2001 was presented and, with revision, adopted as the Proposed Technical Corrigendum below.
Oct 2016 meeting
It was noted that bitfields are of integer type.
In §6.5.1.1p2 change:
The controlling expression of a generic selection shall have type compatible with at most one of the types named in its generic association list.
to
The type of the controlling expression is the type of the expression as if it had undergone an lvalue conversion new), array to pointer conversion, or function to pointer conversion. That type shall be compatible with at most one of the types named in the generic association list.
new) lvalue conversion drops type qualifiers.
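The effect of the adopted wording can be sketched with one case per conversion it names. SELECT, callback and check_adopted_wording are hypothetical names, and the assertions assume a compiler that implements the corrected text; on an implementation that still follows choice 2 they would select the default association instead.

```c
#include <assert.h>

enum sel { SEL_OTHER, SEL_INT, SEL_CHAR_PTR, SEL_FUNC_PTR };

/* Hypothetical classifier with only unqualified, non-array types. */
#define SELECT(X) _Generic((X),   \
    int: SEL_INT,                 \
    char*: SEL_CHAR_PTR,          \
    void (*)(void): SEL_FUNC_PTR, \
    default: SEL_OTHER)

void callback(void) { }

void check_adopted_wording(void) {
    int const fixed = 0;
    char text[4] = "bla";
    assert(SELECT(fixed) == SEL_INT);          /* lvalue conversion drops const */
    assert(SELECT(text) == SEL_CHAR_PTR);      /* array to pointer conversion */
    assert(SELECT("bla") == SEL_CHAR_PTR);     /* string literal decays as well */
    assert(SELECT(callback) == SEL_FUNC_PTR);  /* function to pointer conversion */
}
```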