ISO/IEC JTC1 SC22 WG21 N3661 - 2013-04-19
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
Problem
Solution
Proposal
2.14.2 Integer literals [lex.icon]
2.14.4 Floating literals [lex.fcon]
2.14.8 User-defined literals [lex.ext]
C.new.new Clause 2: lexical conventions [diff.cpp11.lex]
Problem
Numeric literals of more than a few digits are hard to read. Consider the following tasks.
- Pronounce 7237498123.
- Compare 237498123 with 237499123 for equality.
- Decide whether 237499123 or 20249472 is larger.
Solution
The problem has a long history of solutions in writing and typography: digit separators. In the English-speaking world, commas are usually used to separate digits.
- Pronounce 7,237,498,123.
- Compare 237,498,123 with 237,499,123 for equality.
- Decide whether 237,499,123 or 20,249,472 is larger.
We wish to introduce digit separators into C++. Much discussion of constraints and alternatives appears in N3499. We propose using an underscore (aka low line) as a digit separator and a double radix point (aka double dot) as a disambiguating suffix separator.
Proposal
2.14.2 Integer literals [lex.icon]
Edit the grammar as follows. Editor, note the change to the binary literal syntax as described in N3472.
- integer-literal:
  - decimal-literal integer-suffix(opt)
  - octal-literal integer-suffix(opt)
  - hexadecimal-literal integer-suffix(opt)
- decimal-literal:
  - nonzero-digit
  - decimal-literal digit-separator(opt) digit
- octal-literal:
  - 0
  - octal-literal digit-separator(opt) octal-digit
- hexadecimal-literal:
  - 0x hexadecimal-digit
  - 0X hexadecimal-digit
  - hexadecimal-literal digit-separator(opt) hexadecimal-digit
- binary-literal:
  - 0b binary-digit
  - 0B binary-digit
  - binary-literal digit-separator(opt) binary-digit
- nonzero-digit: one of
  - 1 2 3 4 5 6 7 8 9
- octal-digit: one of
  - 0 1 2 3 4 5 6 7
- hexadecimal-digit: one of
  - 0 1 2 3 4 5 6 7 8 9
  - a b c d e f
  - A B C D E F
- digit-separator:
  - _
Edit paragraph 1 as follows.
An integer literal is a sequence of digits that has no period or exponent part, with optional digit separators. These separators are ignored when determining its value. .... [Example: The number twelve can be written 12, 014, or 0XC. The literals 1048576, 1_048_576, 0X100000, 0x10_0000, and 0_004_000_000
all have the same value. —end example]
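As an informal illustration of the rule that separators are ignored when determining the value, the following C++ sketch evaluates literal text under the proposed grammar. The names `digit_value`, `literal_value`, and `check_integer_examples` are hypothetical and not part of the proposal. The sketch covers only unsuffixed decimal, octal, and hexadecimal literals and does not check for overflow. It operates on strings because the proposed syntax is not part of C++ 2011.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of the proposal: maps one character to its
// digit value, or -1 if it is not a hexadecimal digit.
inline int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: computes the value of an unsuffixed decimal, octal, or
// hexadecimal literal written with the proposed '_' separators. Returns false
// for text the proposed grammar rejects. Overflow is not checked.
inline bool literal_value(const std::string& text, std::uint64_t& value) {
    int base = 10;
    std::size_t i = 0;
    bool after_digit = false;  // true when the literal so far ends in a digit
    if (text.size() >= 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X')) {
        base = 16;
        i = 2;                 // "0x" alone is not yet a literal
    } else if (!text.empty() && text[0] == '0') {
        base = 8;
        i = 1;
        after_digit = true;    // "0" alone is already an octal-literal
    }
    std::uint64_t result = 0;
    for (; i < text.size(); ++i) {
        if (text[i] == '_') {
            if (!after_digit) return false;  // a separator must follow a digit
            after_digit = false;
            continue;
        }
        const int d = digit_value(text[i]);
        if (d < 0 || d >= base) return false;
        result = result * static_cast<std::uint64_t>(base) + static_cast<std::uint64_t>(d);
        after_digit = true;
    }
    if (!after_digit) return false;          // empty, or ends in '_' or "0x"
    value = result;
    return true;
}

// Checks that the five spellings from the example denote the same value.
inline void check_integer_examples() {
    const char* const same[] = {"1048576", "1_048_576", "0X100000",
                                "0x10_0000", "0_004_000_000"};
    for (const char* s : same) {
        std::uint64_t v = 0;
        assert(literal_value(s, v) && v == 1048576u);
    }
}
```

Calling `check_integer_examples()` exercises the example literals. Text such as `1__0` or `10_` is rejected because the grammar requires every separator to sit between two digits.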
2.14.4 Floating literals [lex.fcon]
Edit the grammar as follows.
- floating-literal:
  - fractional-constant exponent-part(opt) floating-suffix(opt)
  - digit-sequence exponent-part floating-suffix(opt)
- fractional-constant:
  - digit-sequence(opt) . digit-sequence
  - digit-sequence .
- exponent-part:
  - e sign(opt) digit-sequence
  - E sign(opt) digit-sequence
- sign: one of
  - + -
- digit-sequence:
  - digit
  - digit-sequence digit-separator(opt) digit
Edit within paragraph 1 as follows.
.... The integer and fraction parts both consist of a sequence of decimal (base ten) digits, with optional digit separators. These separators are ignored when determining the value. [Example: The literals
1.602_176_565e-19 and 1.602176565e-19
have the same value. —end example] ....
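The same rule can be sketched for floating literals. The helper below is hypothetical and not part of the proposal: it checks that every separator sits between two digits, as the digit-sequence grammar requires, removes the separators, and delegates to `std::strtod`. It is not a complete check of the floating-literal grammar (for instance, it also accepts a leading sign and plain digit sequences), and it ignores floating-suffixes.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: evaluates floating literal text written with the
// proposed '_' separators. Returns false if a separator is misplaced or if
// std::strtod cannot consume the whole stripped text.
inline bool floating_value(const std::string& text, double& value) {
    std::string stripped;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '_') {
            // A separator is valid only between two digits.
            const bool digit_before =
                i > 0 && std::isdigit(static_cast<unsigned char>(text[i - 1]));
            const bool digit_after =
                i + 1 < text.size() &&
                std::isdigit(static_cast<unsigned char>(text[i + 1]));
            if (!digit_before || !digit_after) return false;
            continue;
        }
        // Only characters that can occur in a decimal floating literal
        // (without suffix) are passed on to std::strtod.
        const bool allowed = std::isdigit(static_cast<unsigned char>(c)) ||
                             c == '.' || c == 'e' || c == 'E' ||
                             c == '+' || c == '-';
        if (!allowed) return false;
        stripped += c;
    }
    if (stripped.empty()) return false;
    char* end = nullptr;
    const double result = std::strtod(stripped.c_str(), &end);
    if (end != stripped.c_str() + stripped.size()) return false;
    value = result;
    return true;
}

// Checks that the two spellings from the example denote the same value.
inline void check_floating_examples() {
    double a = 0.0;
    double b = 0.0;
    assert(floating_value("1.602_176_565e-19", a));
    assert(floating_value("1.602176565e-19", b));
    assert(a == b);
}
```

Because the separators are removed before conversion, both spellings reach `std::strtod` as the same character sequence and therefore yield the same value.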
2.14.8 User-defined literals [lex.ext]
Edit the grammar as follows. Editor, note the change to the binary literal syntax as described in N3472.
- user-defined-literal:
  - user-defined-integer-literal
  - user-defined-floating-literal
  - user-defined-string-literal
  - user-defined-character-literal
- user-defined-integer-literal:
  - decimal-literal separated-suffix
  - octal-literal separated-suffix
  - hexadecimal-literal separated-suffix
  - binary-literal separated-suffix
- user-defined-floating-literal:
  - fractional-constant exponent-part(opt) separated-suffix
  - digit-sequence exponent-part separated-suffix
- user-defined-string-literal:
  - string-literal ud-suffix
- user-defined-character-literal:
  - character-literal ud-suffix
- separated-suffix:
  - suffix-separator(opt) ud-suffix
- suffix-separator:
  - ..
- ud-suffix:
  - identifier
Edit paragraph 1 as follows.
If a token matches both user-defined-literal and another literal kind, it is treated as the latter. [Example: 123_km and 123.._km are user-defined-literals, but 123_456 and 12LL are integer-literals. —end example] The syntactic non-terminal preceding the ud-suffix or separated-suffix in a user-defined-literal is taken to be the longest sequence of characters that could match that non-terminal.
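The longest-match rule and the optional suffix separator can be sketched as a small splitting routine. All names below are hypothetical and not part of the proposal. The sketch handles only decimal, octal, and hexadecimal user-defined-integer-literals, recognizes only ASCII identifiers, and treats the built-in integer suffixes (such as LL) as belonging to an integer-literal when no suffix separator is present.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if c is a digit valid in the given base (8, 10, 16).
inline bool is_digit_in_base(char c, int base) {
    int d = -1;
    if (c >= '0' && c <= '9') d = c - '0';
    else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
    return d >= 0 && d < base;
}

// Length of the longest prefix of t that matches an integer literal under
// the proposed grammar, or 0 if there is none.
inline std::size_t numeric_prefix_length(const std::string& t) {
    int base = 10;
    std::size_t i = 0;
    if (t.size() >= 3 && t[0] == '0' && (t[1] == 'x' || t[1] == 'X') &&
        is_digit_in_base(t[2], 16)) {
        base = 16;
        i = 3;
    } else if (!t.empty() && t[0] == '0') {
        base = 8;
        i = 1;
    } else if (!t.empty() && t[0] >= '1' && t[0] <= '9') {
        i = 1;
    } else {
        return 0;
    }
    for (;;) {
        if (i < t.size() && is_digit_in_base(t[i], base)) {
            i += 1;
        } else if (i + 1 < t.size() && t[i] == '_' &&
                   is_digit_in_base(t[i + 1], base)) {
            i += 2;  // a separator is consumed only together with a following digit
        } else {
            return i;
        }
    }
}

// True if s is a non-empty ASCII identifier.
inline bool is_identifier(const std::string& s) {
    if (s.empty()) return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char c = s[i];
        const bool letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
        const bool digit = c >= '0' && c <= '9';
        if (!(letter || (digit && i > 0))) return false;
    }
    return true;
}

// True if s is a built-in integer-suffix such as u, L, ll, or ULL.
inline bool is_builtin_integer_suffix(std::string s) {
    if (s.empty()) return false;
    if (s.front() == 'u' || s.front() == 'U') s.erase(0, 1);
    else if (s.back() == 'u' || s.back() == 'U') s.pop_back();
    return s.empty() || s == "l" || s == "L" || s == "ll" || s == "LL";
}

// Splits token into its numeric part and ud-suffix. Returns false if the
// token is not a user-defined-integer-literal under this sketch.
inline bool split_integer_udl(const std::string& token, std::string& number,
                              std::string& suffix) {
    const std::size_t n = numeric_prefix_length(token);
    if (n == 0) return false;
    std::size_t s = n;
    const bool separated = token.compare(s, 2, "..") == 0;
    if (separated) s += 2;  // skip the optional suffix-separator
    const std::string rest = token.substr(s);
    if (!is_identifier(rest)) return false;
    if (!separated && is_builtin_integer_suffix(rest)) return false;
    number = token.substr(0, n);
    suffix = rest;
    return true;
}
```

With this sketch, `123_km` and `123.._km` both split into `123` and `_km`, while `123_456` and `12LL` are rejected as user-defined literals because they are plain integer literals. For `0x1234_foo` the numeric part is `0x1234_f`, since `f` is a hexadecimal digit, which leaves the suffix `oo`.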
C.new.new Clause 2: lexical conventions [diff.cpp11.lex]
Add a new section as follows. Editor: please incorporate with N3652.
Add the new text block below.
2.14 [lex.literal]
Change: Digit separator support.
Rationale: Required for new features.
Effect on original feature: Valid C++ 2011 code may change meaning, and hence possibly fail to compile, in this International Standard. A user-defined literal suffix that begins with an underscore followed by a character that may be interpreted as a digit within the context of the enclosing literal may change meaning. For example, 10_10 changes from integer 10 with a suffix of _10 to an integer 1010. The original meaning can be restored with 10.._10. The literal 0x1234_goo has suffix _goo but the literal 0x1234_foo has suffix oo. The literal 0x1234.._foo has suffix _foo.
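For reference, the C++ 2011 meaning that this proposal would change can be observed with ordinary literal operators. In the sketch below the suffix names `_10` and `_foo` come from the examples above, while the operator bodies and the names `ten_ten` and `hex_foo` are arbitrary choices made only for illustration.

```cpp
#include <cassert>

// Illustrative literal operators. The arithmetic is arbitrary and serves
// only to make the operator calls observable.
constexpr unsigned long long operator""_10(unsigned long long v) { return v * 10; }
constexpr unsigned long long operator""_foo(unsigned long long v) { return v + 1; }

// Under C++ 2011 lexing, 10_10 is the integer 10 with ud-suffix _10.
constexpr unsigned long long ten_ten = 10_10;
static_assert(ten_ten == 100, "10_10 is 10 with ud-suffix _10");

// Under C++ 2011 lexing, 0x1234_foo is 0x1234 with ud-suffix _foo.
constexpr unsigned long long hex_foo = 0x1234_foo;
static_assert(hex_foo == 0x1235, "0x1234_foo is 0x1234 with ud-suffix _foo");
```

Under the proposed grammar the first literal would instead denote the integer 1010, and the second would look for a literal operator with suffix `oo`; writing `10.._10` and `0x1234.._foo` would restore the operator calls shown here.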