document: | PL1/SC22/WG14 N3072 |
date: | 2022-12-23 |
editor: | Jens Gustedt |
So the text in the introduction should not refer to one
This NB comment is not meant as a criticism towards the editors, who have done an incredibly good job, but merely to give them the permission and opportunity to correct such small glitches.
<uchar.h>
The type uchar8_t is missing in the listing of symbols added to <uchar.h>.
__has_embed should have symbolic names
Please avoid magic constants even during preprocessing and introduce symbolic names to the effect of
#define __STDC_EMBED_NOT_FOUND__ 0
#define __STDC_EMBED_FOUND__ 1
#define __STDC_EMBED_EMPTY__ 2
For our users the easiest would be to change 6.10.1 p7
The resource (6.10.3.1) identified by the header-name preprocessing token sequence in each contained has_embed expression is searched for as if those preprocessing token were the pp-tokens in a #embed directive, except that no further macro expansion is performed. Such a directive shall satisfy the syntactic requirements of a #embed directive. The has_embed expression evaluates to the same value as the following mandatory macros (6.10.9.1) :
– __STDC_EMBED_NOT_FOUND__ (replacing 0) if the search fails or if any of the embed parameters in the embed parameter sequence specified are not supported by the implementation for the #embed directive; or,
– __STDC_EMBED_FOUND__ (replacing 1) if the search for the resource succeeds and all embed parameters in the embed parameter sequence specified are supported by the implementation for the #embed directive and the resource is not empty; or,
– __STDC_EMBED_EMPTY__ (replacing 2) if the search for the resource succeeds and all embed parameters in the embed parameter sequence specified are supported by the implementation for the #embed directive and the resource is empty.
Add an item to 6.10.9.1
__STDC_EMBED_NOT_FOUND__, __STDC_EMBED_FOUND__, and __STDC_EMBED_EMPTY__ expand to the values 0, 1 and 2, respectively.
If that is not possible please consider adding such symbolic names to a C library header, perhaps <stddef.h>.
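As an illustration of how user code could read with such names, here is a minimal sketch. The helper functions embed_status_name and self_embed_status are hypothetical and exist only for this example. The fallback definitions assume the values 0, 1 and 2 proposed above; defining such reserved names in user code is done here for illustration only.

```c
/* Fallback definitions for implementations that do not predefine the
   names; the values are the ones proposed in this comment. */
#ifndef __STDC_EMBED_NOT_FOUND__
#  define __STDC_EMBED_NOT_FOUND__ 0
#endif
#ifndef __STDC_EMBED_FOUND__
#  define __STDC_EMBED_FOUND__ 1
#endif
#ifndef __STDC_EMBED_EMPTY__
#  define __STDC_EMBED_EMPTY__ 2
#endif

/* Hypothetical helper: map a __has_embed result to a readable label. */
static const char *embed_status_name(int status) {
    switch (status) {
    case __STDC_EMBED_NOT_FOUND__: return "not found";
    case __STDC_EMBED_FOUND__:     return "found";
    case __STDC_EMBED_EMPTY__:     return "empty";
    default:                       return "unknown";
    }
}

/* Hypothetical helper: status of this source file as a resource,
   or -1 if the implementation does not provide __has_embed. */
static int self_embed_status(void) {
#if defined(__has_embed)
#  if __has_embed(__FILE__) == __STDC_EMBED_FOUND__
    return __STDC_EMBED_FOUND__;
#  elif __has_embed(__FILE__) == __STDC_EMBED_EMPTY__
    return __STDC_EMBED_EMPTY__;
#  else
    return __STDC_EMBED_NOT_FOUND__;
#  endif
#else
    return -1;
#endif
}
```

The comparisons against named constants document the intent at each test, which a bare 0, 1 or 2 would not.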
#embed offers multiple ways to express the same feature
Please re-synchronize with WG21 and the version of #embed that has been adopted there, to avoid too much implementation complexity and ensure forward compatibility. In particular, WG21 expressed a slight preference for a version of embed without if_empty/suffix/prefix (which was presented as an optional feature), but there is also a desire to use the same feature as WG14. Please reconsider the necessity of that optional part of the adopted proposal.
WG14 does not stick to their announced policies concerning addition of identifiers. Proposed-C23 is quite ambiguous in that respect: on the one hand it clarifies the rules by introducing the term “potentially reserved identifier”; on the other hand, in direct violation of the policy announced in ISO/IEC 9899:2018, it newly reserves hundreds of unprefixed identifiers with marginal use in the library clause. Some of these are even short abbreviations or common English words that have an increased risk of collision with identifiers in the application realm.
This could be made less worrisome if identifiers that are added but are optional would be “potentially reserved” and not “reserved”. For example, this would in particular ease the pain for free-standing environments that do not intend to implement the decimal floating point option.
Alternatively, these new decimal floating point functions could use a prefix such as stdc_ to avoid clashing with existing code.
The reading of the new paragraph about “potentially reserved identifiers” 6.4.2.1 p10 is not clear in how optional identifiers fit in. Do they become reserved because an implementation implements the feature or are they reserved upfront? An answer to that is not completely trivial, because for example optional macros are often used as feature tests (so they should always be reserved) whereas optional library interfaces (such as atomics, decimal float, or Annex K) only interfere on implementations that choose to implement them.
Proposed change in 6.4.2.1 p10:
Some identifiers may be potentially reserved. A potentially reserved identifier is an identifier which is not reserved unless made so by an implementation providing the identifier (7.1.3) but is anticipated to become reserved by an implementation or a future version of this document. An identifier that this document describes as optional:
- If it is defined as a macro it is reserved.
- Otherwise, if the definition is given in clauses 1 to 6 it is reserved.
- Otherwise, it is potentially reserved.
Most implementations probably already behave according to this; not many of them warn for optional identifiers they don’t implement. Otherwise, more sophisticated tools may be mildly impacted in that they’d have to change their diagnostic for misuse of these identifiers from “reserved” to “potentially reserved”.
unions if they are redeclared.
If there is no designated initializer, unions are initialized as if for their first member. In the proposed wording, redeclaration of a union makes it ambiguous and scope dependent which member would be considered first. Thus initialization of unions becomes fragile and may, for example, change when the include order of headers is changed.
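The dependence on member order can be seen in a small sketch. The union bar mirrors the one used in the examples of this comment; the two helper functions are hypothetical and only serve to contrast the two forms of initialization.

```c
union bar { int x; float y; };

/* Without a designator, the initializer applies to the first member,
   here x. If the members were declared in the other order, the same
   initializer would set y instead. */
static union bar first_member_init(void) {
    union bar b = { 1 };
    return b;
}

/* A designated initializer names the member explicitly and therefore
   does not depend on the order of the member declarations. */
static union bar designated_init(void) {
    union bar b = { .y = 1.0f };
    return b;
}
```

Only the first form is sensitive to which declaration of the union is considered to come first.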
Please fix this to either
The latter is the preferred solution by AFNOR because it is also much easier to implement. Whichever requirement is chosen, it should be made a constraint, and not only made UB. This could e.g. be done in 6.7.2.3 p1
Where two declarations that use the same tag declare the same type, they shall both use the same choice of struct, union, or enum. If two declarations of the same type have a member-declaration or enumerator-list, one shall not be nested within the other and both declarations shall fulfill all requirements of compatible types (6.2.7) with the additional requirement that corresponding members of structure or union types shall appear in the same order and shall have the same (and not merely compatible) types.
Then, the provided examples also need to be adapted
unions named bar to Example 2 and add a comment to the second line
union bar { int x; float y; };
union bar { float y; int x; }; // members are ordered differently
nullptr_t
When introducing nullptr and nullptr_t one case was overlooked for equality comparison in 6.5.9 p2 (constraints) last item:
– one operand is a pointer and the other is a null pointer constant or has type nullptr_t.
Similarly, the corresponding prose in p6 (semantics) should be adapted
Otherwise, at least one operand is a pointer. If one operand is a pointer and the other is a null pointer constant or has type nullptr_t, they compare equal if the former is a null pointer, the null pointer constant is converted to the type of the pointer. If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.
The whole paragraph p2 and in particular its first sentence makes no sense in this version because there are no functions asctime_r or ctime_r that are defined in the document.
Modify as follows:
Functions asctime, ctime, gmtime, and localtime are the same as their counterparts suffixed with _r. In place of the parameter buf, they use a pointer to an object and return it: one or two broken-down time structures (for. Similarly, an array of gmtime and localtime) or char (is commonly used by asctime and ctime). Execution of any of the functions that return a pointer to one of these objects may overwrite the information returned from any previous call to one of these functions that uses the same object. These functions are not reentrant and are not required to avoid data races with each other. Accessing the returned pointer after the thread that called the function that returned it has exited results in undefined behavior. The implementation shall behave as if no other library functions call these functions.
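The practical consequence of the shared object for user code can be sketched as follows. The helper snapshot_utc is hypothetical; it copies the broken-down time out of the object that gmtime returns before any later call can overwrite it. It does not remove the data race between threads that the text describes.

```c
#include <time.h>

/* Hypothetical helper: convert t to UTC and return a private copy.
   gmtime returns a pointer to an object that later calls to gmtime or
   localtime may overwrite, so the result is copied immediately.
   On failure (null pointer) a zero-initialized structure is returned. */
static struct tm snapshot_utc(time_t t) {
    struct tm copy = { 0 };
    struct tm *shared = gmtime(&t);
    if (shared) {
        copy = *shared;
    }
    return copy;
}
```

Two snapshots taken one after the other remain distinct, whereas two raw pointers returned by gmtime may refer to the same object.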
_BitInt(N) to be implemented as macros
The mandatory addition of _BitInt(N) is quite a stretch for small C parsers. Introducing this as a simple identifier with a “functional” argument would make this addition much more friendly for implementations that already have some form of operator overloading (including suffixes); using macros, _Generic and the new typeof feature would permit implementing the whole feature as a library.
We should allow implementations to use all the machinery they already have, without imposing changes to their parsers. This could easily be achieved by adding _BitInt to the list of keywords that may expand to different forms when handled by # and ## during preprocessing. Modify the last sentence of 6.4.1 (Keywords) p3 that talks about such exceptions:
The spelling of these keywords, their alternate forms, and of _BitInt, false and true inside expressions that are subject to the # and ## preprocessing operators is unspecified.75)
The proposed document follows two different strategies concerning version numbers
- __STDC_VERSION_…H__ macros for headers all expand to 202311L, regardless of the effective intermediate document by which they were introduced. This value is the same as __STDC_VERSION__.
- __has_c_attribute is different for different attributes.
Because attributes changed at some points, this distinction might have been interesting for implementations and early adopters during the development phase of this revision. Now as the dust settles, it is of minor importance and should disappear for the benefit of simplicity and our general users.
AFNOR prefers that any calls __has_c_attribute(ID) for a standard attribute ID return the same value as __STDC_VERSION__, namely 202311L. We’d also like to suggest that for the work on future revisions this policy is maintained. It should be easier to maintain and to follow a single version number that reflects a possible working draft than to know these numbers for all features that might change during the elaboration of a revision.
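Whether an implementation reports one value or several can be checked during preprocessing. The following is a minimal sketch; the function attribute_versions_agree is hypothetical, and the choice of nodiscard and deprecated as the two attributes to compare is arbitrary.

```c
/* Hypothetical probe: returns 1 if __has_c_attribute reports the same
   value for nodiscard and deprecated, 0 if the values differ, and -1
   if the implementation does not provide __has_c_attribute at all. */
static int attribute_versions_agree(void) {
#if defined(__has_c_attribute)
#  if __has_c_attribute(nodiscard) == __has_c_attribute(deprecated)
    return 1;
#  else
    return 0;
#  endif
#else
    return -1;
#endif
}
```

Under the policy preferred above, such a probe would return 1 for any pair of standard attributes on a conforming implementation.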
ISO/IEC policy has changed such that now they allow highlighting colors for code snippets. Please consider using that feature not only in working drafts but also for final documents.
In the working draft, colors of identifiers distinguish the status of the underlying definition; keywords are black, macros are red, types are blue etc. Knowing this color code improves readability of the standard and this should not be kept from our end users.
timegm
This function is added to <time.h> but a reference to that change is missing in Annex M.
A reference to the corresponding paper N2833 is also missing in the abstract but this would be removed anyhow for the published document.
PRI and SCN macros are missing for new format specifiers
The new format specifiers %b (and optionally %B) for printf and scanf should be as useful as the existing ones. For that they should also have equivalent macro definitions in <inttypes.h> as do the other format specifiers. Therefore lines
PRIbN PRIbLEASTN PRIbFASTN PRIbMAX PRIbPTR
and
SCNbN SCNbLEASTN SCNbFASTN SCNbMAX SCNbPTR
should be added to 7.8.1 p3 and p5, respectively.
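For comparison, this is how the existing macros are used today for hexadecimal output; the requested PRIb macros would presumably be used in the same way. The sketch deliberately uses only PRIx32, which exists in C17, and the helper format_hex32 is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a 32-bit value in hexadecimal without
   having to know which basic type uint32_t maps to. A PRIb32 macro,
   if added, would replace PRIx32 here to obtain binary output. */
static int format_hex32(char *buf, size_t size, uint32_t value) {
    return snprintf(buf, size, "%" PRIx32, value);
}
```

Without the corresponding macro for b, portable code would have to guess the length modifier for each exact-width type.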
printf format specifier %B optional.
The specifier %B is only recommended and not required because it might be in conflict with existing extensions on some implementations. It would be good to change this recommendation into a proper option such that semantics of this specifier can be tested and then relied upon.
%B optional.
Modify in 7.23.6.1 and 7.31.2.1.
Describe the feature:
# The result is converted to an “alternative form”. For o conversion, it increases the precision, if and only if necessary, to force the first digit of the result to be a zero (if the value and precision are both 0, a single 0 is printed). For b conversion, a nonzero result has 0b prefixed to it. For the optional B conversion as described below, a nonzero result has 0B prefixed to it. For x (or X) conversion, a nonzero result has 0x (or 0X) prefixed to it. …
Add it to the possible specifiers:
b, B, o, u, x, X The unsigned int argument is converted to unsigned binary (b or B), unsigned octal (o), unsigned decimal (u), or unsigned hexadecimal notation (x or X) in the style dddd; the letters abcdef are used for x conversion and the letters ABCDEF for X conversion. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it is expanded with leading zeros. The default precision is 1. The result of converting a zero value with a precision of zero is no wide characters. The specifier B is optional and provides the same functionality as b, except for the # flag as specified above. The macro PRIBPTR from <inttypes.h> shall only be defined if the implementation follows the specification as given here.
Change the text in the “recommended practice”
14 The uppercase B format specifier is made optional by the description above, because it used to be available for extensions in previous versions of this standard.
Implementations that did not use an uppercase B as their own extension before are encouraged to implement it as an option as described above.
PRIBN and similar
Add to 7.8.1 after p3
3’ The following printf macros for unsigned integer types are optional:
PRIBN PRIBLEASTN PRIBFASTN PRIBMAX PRIBPTR
They shall be defined if the implementation supports the B specifier as indicated in 7.23.6.1 and 7.31.2.1; otherwise they shall not be defined.
Add to 7.33.6 (<inttypes.h>) p1:
Macros that begin with either PRI or SCN, and either a lowercase letter, B, or X are potentially reserved identifiers and may be added to the macros defined in the <inttypes.h> header.
Add to 7.33.14 p1 (<stdio.h>) and analogously to 7.33.20 (<wchar.h>) p2:
Lowercase letters may be added to the conversion specifiers and length modifiers in fprintf and fscanf. Other characters may be used in extensions. The specifier B for printf may become mandatory in future versions of this document.
strtol, scanf and similar functions
Integer conversion functions strtol and similar now accept the new binary integer constants if they encounter the prefix 0b or 0B. This can happen if the number base is explicitly provided as 2 or implicitly by providing a 0. This is a semantic change: for example the following code
unsigned long res = strtoul("0b1", 0, 2);
has res ≡ 0 for C17 and res ≡ 1 for C23, because for the first the interpretation stops before the b. This semantic change concerns strtol and similar functions for base 0 and 2. Because scanf always uses an explicit format specifier and C17 had no specifier for base 2, it is not affected by this change.
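The change can be observed at run time by also looking at how many characters were consumed. The following sketch uses a hypothetical helper parse_base2; which of the two outcomes is obtained depends on the C library that is linked, not on the source code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse text in base 2 and report how many
   characters strtoul consumed. For the input "0b1" a C17 library
   stops after the leading 0 (value 0, 1 character consumed), whereas
   a C23 library accepts the prefix (value 1, 3 characters consumed). */
static unsigned long parse_base2(const char *text, size_t *consumed) {
    char *end = NULL;
    unsigned long value = strtoul(text, &end, 2);
    if (consumed) {
        *consumed = (size_t)(end - text);
    }
    return value;
}
```

Code that checks the end pointer at least detects the difference; code that passes a null pointer, as in the one-line example, silently receives a different value.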
It is not clear how C libraries are supposed to handle this:
Should they provide one such function that when linked (possibly dynamically) to existing code changes semantics of an execution after an update of the C library?
Should they provide two functions, an old C17 one and a new C23 one? How should a call resolve:
__STDC_VERSION__) ?
strtol or scanf
For C17, integer constants that are accepted as literals and those that are accepted by tools such as strtol or scanf are the same. Therefore code as the following works for any implementation where a valid C17 integer literal is supplied for ULLONG_MAX.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
// two-level expansion so that the argument is macro-expanded before # is applied
#define STRINGIFY_(X) #X
#define STRINGIFY(X) STRINGIFY_(X)
unsigned long long what(char const* s) {
  // base 0: the prefix of the string decides between octal, decimal and hexadecimal
  return strtoull(s, 0, 0);
}
char const elements[] = STRINGIFY(ULLONG_MAX);
int main(int argc, char* argv[argc+1]) {
  char const* p = (argc > 1) ? argv[1] : elements;
  if (what(p) < 127) {
    printf("unusual platform with %s max\n", p);
  }
}
Here the initializer of elements would typically be a string such as "0xffffffffffffffffLL" or "18446744073709551615", which then would be correctly recognized as a number and skip the call to printf.
In C23, if the implementation chooses to change 18446744073709551615 to 18'446'744'073'709'551'615 (such as to improve readability) the call to strtoull would only see the leading 18 and run into the branch with the call to printf.
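The truncation at the first digit separator can be reproduced in isolation. In this sketch the helper parse_prefix is hypothetical; it returns the value that strtoull recognizes and the position where the conversion stopped.

```c
#include <stdlib.h>

/* Hypothetical helper: convert the longest numeric prefix of text with
   base 0 and report where the conversion stopped. The digit separator
   ' is not part of the input format of strtoull, so conversion ends
   in front of the first separator. */
static unsigned long long parse_prefix(const char *text, const char **rest) {
    char *end = NULL;
    unsigned long long value = strtoull(text, &end, 0);
    if (rest) {
        *rest = end;
    }
    return value;
}
```

For the string "18'446'744'073'709'551'615" this yields the value 18 with the rest pointing to the first ' character, which is the situation described above.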
The corresponding semantic change affects integer and floating point literals and extends to all variants of strto* and scanf. Also, this problem equally concerns strings that are formed from literals that are part of the implementation or any other third party code base that is maintained independently. The potential semantic change of user code makes the new extended syntax for number literals with digit separators non-portable and even security critical.
It is difficult to tell the impact that these changes will have in the field. Problematic input sequences composed with the pairs 0b or 0B that would lead to changes in behavior are difficult to detect or predict, in particular for programs that treat large inputs from non-sanitized sources dynamically. Because of this, such programs may malfunction for a long time silently before problems would be detected.
The second problem would probably have the consequence that the new digit separator will be banned by coding styles and secure coding guidelines, which undermines the usefulness of this new feature.
0b and 0B prefixes for integer literals from C23.
' digit separator from C23.
strto* and scanf functions as they are in C17 and mark them as [[deprecated]] (but not necessarily as obsolescent).
stdc_str*, stdc_scanf and similar that additionally to their C17 counterparts allow the 0b and 0B prefixes and that deal with the ' digit separator.
#if __STDC_VERSION__ >= 202311L
[[deprecated("strings with a '0b' or '0B' prefix may have changed meaning in C23, consider using stdc_strtol")]]
long int strtol(const char*restrict, char **restrict, int);
long int stdc_strtol(const char *restrict nptr, char **restrict endptr, int base);
#else
long int strtol(const char*restrict, char **restrict, int);
#endif
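To give an idea of what a separately named function could look like on top of the C17 interface, here is a rough sketch. The name sketch_strtol_0b is hypothetical and chosen to avoid any claim on the stdc_ prefix. The sketch only handles the binary prefix, not the digit separator, and it does not treat the most negative value of long correctly.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical sketch: like strtol, but for base 0 or 2 it accepts an
   optional 0b or 0B prefix regardless of how the underlying strtol
   treats that prefix. Limitation: a negative binary input equal to
   LONG_MIN is reported as -LONG_MAX. */
static long sketch_strtol_0b(const char *nptr, char **endptr, int base) {
    const char *p = nptr;
    while (isspace((unsigned char)*p)) {
        ++p;
    }
    bool negative = (*p == '-');
    if (*p == '+' || *p == '-') {
        ++p;
    }
    bool has_prefix = (base == 0 || base == 2)
        && p[0] == '0' && (p[1] == 'b' || p[1] == 'B')
        && (p[2] == '0' || p[2] == '1');
    if (has_prefix) {
        /* Skip the prefix and let strtol convert the binary digits. */
        long magnitude = strtol(p + 2, endptr, 2);
        return negative ? -magnitude : magnitude;
    }
    /* No binary prefix: defer to the library function unchanged. */
    return strtol(nptr, endptr, base);
}
```

Because the prefix is handled explicitly, callers obtain the same result whether the library underneath follows the C17 or the C23 rules for these inputs.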
va_start becomes too permissive
This macro previously had the name of the last named argument as second parameter. This is changed to ... with the intent that this parameter can now be omitted. Unfortunately this allows any number of arguments of any kind, and with the proposed text implementations would not even be allowed to diagnose such code.
We think that the proposed specification goes too far in that it not only guarantees that these arguments are not evaluated (which should be maintained) but also that they should not be expanded. We propose to change as follows.
Only the first argument passed to va_start is evaluated. Any additional arguments are not used other than for possible diagnostics and will not be evaluated for any reason.
Alternatively the last sentence could just be suppressed.
Only the first argument passed to va_start is evaluated.
Any additional arguments are not used by the macro for purposes other than possible diagnostics and will not be expanded or evaluated for any reason.
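For reference, the traditional two-argument form that the second parameter was designed for looks as follows and remains valid under either wording. The function sum_ints is a hypothetical example.

```c
#include <stdarg.h>

/* Hypothetical example: sum count values of type int. The second
   argument of va_start names the last named parameter, as required
   before C23; under the C23 text it is accepted but not evaluated. */
static int sum_ints(int count, ...) {
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

A diagnostic for a second argument that is not the last named parameter is the kind of check that the proposed change would keep possible.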