2024-04-14
document number | date | comment |
---|---|---|
n3189 | 202310 | original proposal |
n3239 | 202404 | this paper |
 | | → ordinary {character¦string} literal |
 | | precise lists of changes |
 | | proposed git branch for LaTeX sources |
CC BY, see https://creativecommons.org/licenses/by/4.0
The C standard differentiates two terms that have surprisingly different meanings:
“integer constant”, which describes a certain category of tokens after lexing
“integer constant expression”, which associates special properties to a non-terminal in the grammar
This even leads to confusion in the standard itself, because the definitions of `sizeof` and `alignof` cyclically refer to the definitions of “integer constant” and “integer constant expression”. In other places the term “integer constant” is seemingly used with a different meaning than its definition, namely in places where talking about a constant of integer type would be in order.
The goal of this paper is to rename the terms “integer constant”, “floating constant”, “character constant” to “integer literal”, “floating literal” and “character literal”. A table summarizing these systematic changes can be found towards the end.
The confusion arises because the long list of occurrences mixes cases that concern lexical concepts with others that concern semantic properties of certain subexpressions. We propose to make these distinctions clearer by consistently talking about “literals” when we address a lexical feature, and about “constants” as a semantic concept that is attached to certain grammatical entities but is in general not deducible from local syntactic features.
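The distinction can be illustrated with a minimal sketch in C. The array names and helper functions below are hypothetical and serve only as illustration. Only the first array size is a single token, that is, an integer literal in the proposed terminology. The other sizes are integer constant expressions, a property of the whole expression that cannot be read off any single token.

```c
#include <assert.h>
#include <stddef.h>

enum { BUFFER_COUNT = 4 };   /* enumeration constant, not a literal */

/* 42 is one token: an integer literal in the proposed terminology. */
static int from_literal[42];

/* None of these sizes is a literal token, but each one is an
   integer constant expression. */
static int from_enum[BUFFER_COUNT];
static int from_sizeof[sizeof(int) * 2];
static int from_arithmetic[(1 << 3) + BUFFER_COUNT];

/* Hypothetical helpers that report the resulting element counts. */
size_t count_from_literal(void)    { return sizeof from_literal / sizeof from_literal[0]; }
size_t count_from_enum(void)       { return sizeof from_enum / sizeof from_enum[0]; }
size_t count_from_sizeof(void)     { return sizeof from_sizeof / sizeof from_sizeof[0]; }
size_t count_from_arithmetic(void) { return sizeof from_arithmetic / sizeof from_arithmetic[0]; }

/* static_assert itself requires an integer constant expression. */
static_assert((1 << 3) + BUFFER_COUNT == 12, "evaluated during translation");
```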
We propose to make the following changes.
p8 An integer constant expression¹³⁰⁾ shall have integer type and shall only have operands that are integer ~~constants~~ literals, named constants, ~~and~~ compound literal constants of integer type, character ~~constants~~ literals, `sizeof` expressions whose results are integer ~~constants~~ constant expressions, `alignof` expressions, and floating literals, named constants, or compound literal constants of arithmetic type that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the typeof operators, `sizeof` operator, or `alignof` operator.
…
p10 An arithmetic constant expression shall have arithmetic type and shall only have operands that are integer ~~constants~~ literals, floating ~~constants~~ literals, named constants or compound literal constants of arithmetic type, character ~~constants~~ literals, `sizeof` expressions whose results are integer ~~constants~~ constant expressions, and `alignof` expressions. Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types, except as part of an operand to the typeof operators, `sizeof` operator, or `alignof` operator.
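As an illustration of the two paragraphs above, the following sketch uses the operand kinds they list. All identifiers are hypothetical. The enumerators require integer constant expressions, while the initializer of the object with static storage duration is an arithmetic constant expression.

```c
#include <assert.h>

/* A floating literal as the immediate operand of a cast to an integer
   type is permitted in an integer constant expression. */
enum { TRUNCATED = (int)2.9 };

/* Character literals as operands; the decimal digits are guaranteed
   to have consecutive values. */
enum { DIGIT_SPAN = '9' - '0' };

/* sizeof and alignof expressions as operands (spelled _Alignof so that
   the sketch also compiles before C23). */
enum { UNIT = sizeof(char) + _Alignof(char) };

/* Arithmetic constant expression: floating, integer and character
   literals combined in the initializer of a static object. */
static const double scaled = 2.5 * 4 + ('9' - '0');

static_assert(TRUNCATED == 2, "conversion truncates toward zero");

/* Hypothetical accessors for the values above. */
int truncated_value(void) { return TRUNCATED; }
int digit_span(void)      { return DIGIT_SPAN; }
int unit_value(void)      { return UNIT; }
double scaled_value(void) { return scaled; }
```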
The current text of 6.4.8 (Preprocessing numbers) omits that some pp-number tokens are already interpreted as integer literals during preprocessing itself.
3 Preprocessing number tokens lexically include all floating and integer ~~constant tokens~~ literals.
Semantics
4 A preprocessing number does not have type or a value; it acquires both after a successful conversion (as part of translation phase 7) to a floating ~~constant token~~ or integer ~~constant token~~ literal. This notwithstanding, for the evaluation of expressions within conditional source inclusion (6.10.1) and to determine the `limit` parameter for binary resource inclusion (6.10.3), preprocessing numbers have the form of an integer literal and are interpreted as such. For the determination of a line number in a `#line` directive (6.10.5) digit sequences that also match the requirements for a preprocessing number are interpreted as numbers as well, only that the interpretation is of a decimal integer, even if the leading digit is `0`.
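The two interpretations described in this paragraph can be contrasted in a short sketch. The macro, enumerator and function names are hypothetical. Some compilers may warn that the `#line` operand is read as decimal rather than octal.

```c
/* Within #if, pp-numbers are read as integer literals, so the usual
   prefixes apply: 010 is octal and 0x10 is hexadecimal. */
#if 010 == 8 && 0x10 == 16
#define PREFIXES_HONORED 1
#else
#define PREFIXES_HONORED 0
#endif

/* In a #line directive the digit sequence is always decimal, so the
   line following the directive is numbered 10, not 8. */
#line 010
enum { LINE_AFTER_DIRECTIVE = __LINE__ };

/* Hypothetical accessors for the two observations. */
int prefixes_honored(void)     { return PREFIXES_HONORED; }
int line_after_directive(void) { return LINE_AFTER_DIRECTIVE; }
```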
Already in the existing text the term “integer character constant” is a misnomer, because in fact all character constants have integer type. With the proposed changes “integer character constant” would now become “integer character literal”, which is equally confusing. In alignment with C++, we propose to change this term to “ordinary character literal”.
In some occurrences, the term “integer character constant” is currently used as if it could also include UTF-8 character constants/literals. Therefore we propose to introduce a new term “narrow character literal” that comprises these two cases. Changes are in 6.4.4.5 (Character constants), p2:
An ~~integer~~ ordinary character ~~constant~~ literal is a sequence of one or more multibyte characters enclosed in single-quotes, as in `'x'`. A UTF-8 character ~~constant~~ literal is the same, except prefixed by `u8`. Together ordinary and UTF-8 character literals are narrow character literals. A `wchar_t` character ~~constant~~ literal is prefixed by the letter `L`. A UTF-16 character ~~constant~~ literal is prefixed by the letter `u`. A UTF-32 character ~~constant~~ literal is prefixed by the letter `U`. Collectively, `wchar_t`, UTF-16, and UTF-32 character ~~constants~~ literals are called wide character ~~constants~~ literals. …
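The five kinds of character literal named in this paragraph can be told apart by their types, as the following sketch suggests. It assumes a hosted implementation. Because `<uchar.h>` is not available everywhere, it compares against `uint_least16_t` and `uint_least32_t`, which the standard specifies as the types underlying `char16_t` and `char32_t`. The UTF-8 case is guarded, since `u8` character literals only exist from C23 on. The accessor function is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* wchar_t */
#include <stdint.h>   /* uint_least16_t, uint_least32_t */

/* Narrow character literals. An ordinary character literal has type
   int, which is why "integer character constant" names nothing special. */
static_assert(sizeof('x') == sizeof(int), "ordinary character literal");
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
static_assert(sizeof(u8'x') == 1, "UTF-8 character literal");
#endif

/* Wide character literals. */
static_assert(sizeof(L'x') == sizeof(wchar_t), "wchar_t character literal");
static_assert(sizeof(u'x') == sizeof(uint_least16_t), "UTF-16 character literal");
static_assert(sizeof(U'x') == sizeof(uint_least32_t), "UTF-32 character literal");

/* Hypothetical accessor: the UTF-32 code unit of U'x' is the code
   point U+0078. */
unsigned long utf32_value_of_x(void) { return U'x'; }
```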
Similarly, the existing term “character string literal” is very confusing, because all string literals consist of characters. Therefore we propose to rename these to “ordinary string literal”, which is consistent with the new naming for character literals.
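A brief sketch of why “character string literal” does not single out one kind: every string literal below consists of the same three characters plus a terminating null, and only the prefix and therefore the element type differ. The `ELEMENTS` macro and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of array elements of a string literal, including the
   terminating null element. */
#define ELEMENTS(lit) (sizeof(lit) / sizeof((lit)[0]))

/* Ordinary string literal: no prefix, elements of type char. */
size_t ordinary_elements(void) { return ELEMENTS("abc"); }

/* The prefixed kinds: UTF-8, wchar_t, UTF-16 and UTF-32. */
size_t utf8_elements(void)  { return ELEMENTS(u8"abc"); }
size_t wide_elements(void)  { return ELEMENTS(L"abc"); }
size_t utf16_elements(void) { return ELEMENTS(u"abc"); }
size_t utf32_elements(void) { return ELEMENTS(U"abc"); }

/* Only the element size distinguishes the kinds. */
static_assert(sizeof("abc"[0]) == 1, "ordinary elements have type char");
static_assert(sizeof(u"abc"[0]) == sizeof(uint_least16_t), "UTF-16 elements");
static_assert(sizeof(U"abc"[0]) == sizeof(uint_least32_t), "UTF-32 elements");
```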
It seems that unfortunately the terminology for strings that the Library section introduces has very similar problems. These problems will be tackled by a specific paper.
First apply the changes as indicated above. Then apply the following replacements, in the order shown:
C23 | C2y | |
---|---|---|
integer character constant | ordinary character literal | as above and in 6.4.4.5 p11, p17 and p18, in 6.10.10.3, 7.21 p3, I.2, J.3.5 (9) |
integer character constant | narrow character literal | 6.4.4.5 p5 and p6, 6.4.5 p4 |
character constant | character literal | otherwise |
integer constant | integer constant expression | 6.5.4.4 p2, 6.9.1 p3, 7.22.4 p1, 7.29.1 p2 |
integer constant | constant of integer type | 7.22.4 title, 7.22.4.1 title, 7.22.4.2 title |
named integer constant value | named constant of integer type | 6.2.25 p21 |
integer constant token | integer literal | 6.4 p5, 6.4.8 p3 and p4 |
floating constant token | floating literal | 6.4 p5, 6.4.8 p3 and p4 |
integer constant | integer literal | other than in integer constant expression and M.5 p1 |
decimal constant | decimal literal | |
hexadecimal constant | hexadecimal literal | |
octal constant | octal literal | |
binary constant | binary literal | |
floating constant | floating literal | |
decimal floating constant | decimal floating literal | |
hexadecimal floating constant | hexadecimal floating literal | |
fractional constant | fractional literal | |
hexadecimal fractional constant | hexadecimal fractional literal | |
character string literal | ordinary string literal | |
The branch “literals” in the WG14 git repository contains an implementation of the proposed changes to the LaTeX document.
Since it has changes all over the place, we organized them in a special way: occurrences of the following terms are replaced with LaTeX commands in curly braces, of the form `{\XXX}`, as indicated:
\newcommand{\ICE}{integer constant expression}
\newcommand{\CE}{constant expression}
\newcommand{\ACE}{arithmetic constant expression}
\newcommand{\ADC}{address constant}
\newcommand{\NPC}{null pointer constant}
\newcommand{\EC}{enumeration constant}
\newcommand{\CLC}{compound literal constant}
\newcommand{\NC}{named constant}
If the editors think that these should be reverted, this can easily be done afterwards, but for the review phase they should be kept.
Reviewers of the change set in the git repository should first verify that all these terms have been correctly replaced by the corresponding LaTeX command. This can be achieved visually (after fetching the recent state from the server) by the command
git diff --color-words origin/c2y..origin/literals
The changes as proposed by this document have only been applied after that. With the above `git` command these should then mainly appear as replacements of the string “constant” by the string “literal”.
After having checked out the `literals` branch, the LaTeX sources can then be checked for remaining occurrences of the string “constant”, for example:
grep --color -inH --null -e constant *.tex
These should be very much restricted and easily verified as being correct usages of the word, such as for mathematical constants or as in “constant rounding mode”.