1. Revision history
1.1. Changes since R4
- Include preliminary wording
- Revamp argument handling and erasure machinery to better elide copies and moves
  - Redefine `make_scan_result`, `make_scan_args`, and `scan-arg-store`, add `fill_scan_result`.
- Add discussion on the name `scan`.
- Make fill+align logic more lenient and easy to understand.
  - Add precision-specifier to specify maximum field width.
  - Modify width-specifier to specify minimum field width.
- Remove `scan_error` success state, replace with `expected<void, scan_error>`.
- Split `scan_error::value_out_of_range` into four separate enumerators, for positive and negative overflow and underflow.
- Add `scan_error::invalid_literal`, `scan_error::invalid_fill`, and `scan_error::length_too_short`, all of which were previously covered by `scan_error::invalid_scanned_value`.
- Revise error handling in `scanner::parse`.
  - `scanner::parse` now returns `iterator`, instead of `expected<iterator, scan_error>`.
  - Add `scan_format_string_error`.
- Rename `scan_error::end_of_range` -> `scan_error::end_of_input`.
- Add parsing of pointers (`void*` and `const void*`).
- Remove requirement for localized numbers to have "correct" digit grouping as specified by `numpunct::grouping`.
- Remove design discussion on a dedicated flag for thousands separators (`'`), separate from locale.
- Remove detailed design discussion on error handling alternatives.
- Update example on user-defined type scanning.
- Clarify meaning of "whitespace" further in § 4.2 Format strings.
- Fix example claiming `std::expected::operator->` throws on an expected containing an error.
- SG9: Make `borrowed_tail_subrange_t` exposition-only (`borrowed-tail-subrange-t`).
- Make concept `scannable_range` exposition-only (`scannable-range`).
  - SG9: Add requirement to `scannable-range` for the `value_type` to either be `char` or `wchar_t`.
- Formatting and styling fixes.
1.2. Changes since R3
- Replace `scan_args_for` with `scan_args` and `wscan_args` for consistency with `std::format`.
- Rename `borrowed_ssubrange_t` to `borrowed_tail_subrange_t`, partly based on the naming from ranges-v3 (`tail_view`).
- Replace `format_string` with `scan_format_string`, with a `Range` template parameter.
  - Enables compile-time checking for compatibility of the source range, and arguments to scan
- Make `[v]scan_result_type` (the return types of `std::scan` and `std::vscan`) exposition only.
- Remove `visit_scan_arg`: follow [P2637] and use `std::variant::visit`, instead.
- Add discussion on `stdin` support, guided by SG9 polls.
- Make encoding errors be errors for strings, instead of garbage-in-garbage-out.
- Add further discussion on field widths.
- Add example as rationale for mandating `forward_range`.
1.3. Changes since R2
- Return a `subrange` from `scan`, instead of just an iterator: discussion in § 4.5 Argument passing, and return type of scan.
- Default `CharT` to `char` in `scanner` for consistency with `formatter` (previously no default for `CharT`).
- Add design discussion about thousands separators.
- Add design discussion about additional error information.
- Add clarification about field width calculation in § 4.3.4 Width and precision.
- Add note about scope at the end of § 2 Introduction.
- Fix/clarify error handling in example § 3.5 Alternative error handling.
- Address SG16 feedback:
  - Add definition of "whitespace", and clarify matching of non-whitespace literal characters, in § 4.2 Format strings.
  - Add section about text encoding § 4.11 Encoding, and an example about reading code units § 4.3.8 Type specifiers: CharT.
  - Add example about using locales in § 4.10 Locales.
  - Add potential future extension: § 6.3 Reading code points (or even grapheme clusters?)
1.4. Changes since R1
- Thoroughly describe the design
- Add examples
- Add specification (synopses only)
- Design changes:
  - Return an `expected` containing a `tuple` from `std::scan`, instead of using output parameters
  - Make `std::scan` take a range instead of a `string_view`
  - Remove support for partial successes
2. Introduction
With the introduction of `std::format` [P0645], standard C++ has a convenient, safe, performant, extensible, and elegant facility for text formatting, over `std::ostream` and the `printf`-family of functions. The story is different for simple text parsing: the standard only provides `std::istream` and the `scanf` family, both of which have issues. This asymmetry is also arguably an inconsistency in the standard library.
According to [CODESEARCH], a C and C++ codesearch engine based on the ACTCD19 dataset, there are 389,848 calls to `sprintf` and 87,815 calls to `sscanf` at the time of writing. So although formatted input functions are less popular than their output counterparts, they are still widely used.
The lack of a general-purpose parsing facility based on format strings has been raised in [P1361] in the context of formatting and parsing of dates and times.
This paper proposes adding a symmetric parsing facility, `std::scan`, to complement `std::format`. This facility is based on the same design principles and shares many features with `std::format`.
This facility is not a parser per se, as it is probably not sufficient for parsing something more complicated, e.g. JSON. It is not a parser combinator library. It is intended to be an almost-drop-in replacement for `sscanf`, capable of being a building block for a more complicated parser.
3. Examples
3.1. Basic example
    if (auto result = std::scan<std::string, int>("answer = 42", "{} = {}")) {
        //                  ~~~~~~~~~~~~~~~~   ~~~~~~~~~~~    ~~~~~~~
        //                    output types        input       format
        //                                                    string

        const auto& [key, value] = result->values();
        //           ~~~~~~~~~~
        //            scanned
        //            values

        // result is a std::expected<std::scan_result<...>>.
        // result->range() gives an empty range.
        // result->begin() == result->end()
        // key == "answer"
        // value == 42
    } else {
        // We would end up here if we had an error.
        std::scan_error error = result.error();
    }
3.2. Reading multiple values at once
    auto input = "25 54.32E-1 Thompson 56789 0123";

    auto result = std::scan<int, float, std::string_view, int, float, int>(
        input, "{:d}{:f}{:9}{:2i}{:g}{:o}");

    // result is a std::expected, value() will throw if it doesn't contain a value
    auto [i, x, str, j, y, k] = result.value().values();

    // i == 25
    // x == 54.32e-1
    // str == "Thompson"
    // j == 56
    // y == 789.0
    // k == 0123
3.3. Reading from a range
    std::string input{"123 456"};
    if (auto result = std::scan<int>(std::views::reverse(input), "{}")) {
        // If only a single value is returned, it can be accessed with result->value()
        // result->value() == 654
    }
3.4. Reading multiple values in a loop
    std::vector<int> read_values;
    std::ranges::forward_range auto range = ...;

    auto input = std::ranges::subrange{range};

    while (auto result = std::scan<int>(input, "{}")) {
        read_values.push_back(result->value());
        input = result->range();
    }
3.5. Alternative error handling
    // Since std::scan returns a std::expected,
    // its monadic interface can be used

    auto result = std::scan<int>(..., "{}")
        .transform([](auto result) {
            return result.value();
        });
    if (!result) {
        // handle error
    }
    int num = *result;

    // With [P2561]:
    int num = std::scan<int>(..., "{}").try?.value();
3.6. Scanning a user-defined type
    struct mytype {
        int a{}, b{};
    };

    // Specialize std::scanner to add support for user-defined types.
    template <>
    struct std::scanner<mytype> {
        // Parse format string: only accept empty format strings
        template <typename ParseContext>
        constexpr auto parse(ParseContext& pctx)
            -> typename ParseContext::iterator {
            return pctx.begin();
        }

        // Scan the value from `ctx`:
        // delegate to `std::scan`
        template <typename Context>
        auto scan(mytype& val, Context& ctx) const
            -> std::expected<typename Context::iterator, std::scan_error> {
            return std::scan<int, int>(ctx.range(), "[{}, {}]")
                .transform([&val](const auto& result) {
                    std::tie(val.a, val.b) = result.values();
                    return result.begin();
                });
        }
    };

    auto result = std::scan<mytype>("[123, 456]", "{}");
    // result->value().a == 123
    // result->value().b == 456
4. Design
The new parsing facility is intended to complement the existing C++ I/O streams library, integrate well with the chrono library, and provide an API similar to `std::format`. This section discusses the major features of its design.
4.1. Overview
The main user-facing part of the library described in this paper is the function template `std::scan`, the input counterpart of `std::format`. The signature of `std::scan` is as follows:

    template <class... Args, scannable-range<char> Range>
    auto scan(Range&& range, scan_format_string<Range, Args...> fmt)
        -> expected<scan_result<borrowed-tail-subrange-t<Range>, Args...>, scan_error>;

    template <class... Args, scannable-range<wchar_t> Range>
    auto scan(Range&& range, wscan_format_string<Range, Args...> fmt)
        -> expected<scan_result<borrowed-tail-subrange-t<Range>, Args...>, scan_error>;

`std::scan` reads values of type `Args...` from the `range` it's given, according to the instructions given to it in the format string, `fmt`. `std::scan` returns an `expected`, containing either a `scan_result`, or a `scan_error`. The `scan_result` object contains a `subrange` pointing to the unparsed input, and a `tuple` of `Args...`, containing the scanned values.
4.1.1. Naming of the function scan
The proposed name for the function `std::scan` has caused some dissent, namely in the FP and HPC circles. They argue that `scan` is the name of an algorithm, which is also already in the standard library, in the form of `std::inclusive_scan` and `std::exclusive_scan` (see Wikipedia: Prefix sum, and cppreference.com: std::inclusive_scan).
However, the aforementioned algorithm doesn't have exclusive ownership of the name `scan`. `scan` is an extremely common name for the operation proposed in this paper, and has very long-standing precedent in the C and C++ standard libraries in the form of the `scanf` family of functions.
An alternative often thrown around is the name `parse`. There are two problems with that name:

- `parse` is a larger land-grab than `scan`, and is potentially misleading. The facility proposed in this paper is NOT a parser combinator library, but something closer to a `scanf` replacement, with a more limited scope.
- `parse` is already a term used in this paper, and in `std::format`: it's used to describe the action of format string parsing. It's found in the member function `std::formatter::parse` / `std::scanner::parse`, and in the class templates `std::basic_format_parse_context` / `std::basic_scan_parse_context`. The member functions doing the actual formatting and scanning in `formatter` and `scanner` are called the same as the public interface functions: `format` and `scan`, respectively. Were `std::scan` to be called `std::parse`, it's unclear what `std::scanner`, `std::scanner::parse`, `std::scanner::scan`, and `std::basic_scan_parse_context` should be called.
4.2. Format strings
As with `printf`, the `scanf` syntax has the advantage of being familiar to many programmers. However, it has similar limitations:

- Many format specifiers like `hh`, `h`, `l`, `j`, etc. are used only to convey type information. They are redundant in type-safe parsing and would unnecessarily complicate specification and parsing.
- There is no standard way to extend the syntax for user-defined types.
- Using `'%'` in a custom format specifier poses difficulties, e.g. for `get_time`-like time parsing.

Therefore, we propose a syntax based on `std::format` and [PARSE]. This syntax employs `'{'` and `'}'` as replacement field delimiters instead of `'%'`. It will provide the following advantages:

- An easy-to-parse mini-language focused on the data format rather than conveying the type information
- Extensibility for user-defined types
- Positional arguments
- Support for both locale-specific and locale-independent parsing (see § 4.10 Locales)
- Consistency with `std::format`.

At the same time, most of the specifiers will remain quite similar to the ones in `scanf`, which can simplify a, possibly automated, migration.
Maintaining similarity with `scanf`, for any literal non-whitespace character in the format string, an identical character is consumed from the input range. For whitespace characters, all available whitespace characters are consumed.
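
The matching rules above mirror what `scanf` already does for ASCII input. As a rough illustration, the sketch below uses `std::sscanf` (not the proposed `std::scan`): a space in the format string consumes any run of whitespace, while the literal `=` must be matched by an identical character. The helper name `matched_fields` is made up for this example.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: returns how many fields std::sscanf assigned when
// matching `input` against a format containing whitespace and a literal '='.
int matched_fields(const char* input, char (&key)[16], int& value) {
    // Each space in the format matches zero or more whitespace characters;
    // '=' has to be matched by an identical character in the input.
    return std::sscanf(input, "%15s = %d", key, &value);
}
```

With this helper, `"answer = 42"` and `"answer \n\t =   7"` both assign two fields, whereas `"answer : 7"` stops at the mismatching literal after assigning only the first field.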
In this proposal, "whitespace" is defined to be the Unicode code points with the Pattern_White_Space property, as defined by UAX #31 (UAX31-R3a). Those code points are:

- ASCII whitespace characters:
  - U+0009 (HORIZONTAL TABULATION `'\t'`)
  - U+000A (LINE FEED `'\n'`)
  - U+000B (VERTICAL TABULATION `'\v'`)
  - U+000C (FORM FEED `'\f'`)
  - U+000D (CARRIAGE RETURN `'\r'`)
  - U+0020 (SPACE `' '`)
- U+0085 (NEXT LINE)
- U+200E (LEFT-TO-RIGHT MARK)
- U+200F (RIGHT-TO-LEFT MARK)
- U+2028 (LINE SEPARATOR)
- U+2029 (PARAGRAPH SEPARATOR)

Unicode defines a lot of different things in the realm of whitespace, all for different kinds of use cases. The Pattern_White_Space property is chosen for its stability (it's guaranteed to not change), and because its intended use is for classifying things that should be treated as whitespace in machine-readable syntaxes. `std::isspace` is insufficient for usage in a Unicode world, because it only accepts a single code unit as input.
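
To make the code point list concrete, here is a minimal sketch of a classification function over code points (rather than code units). The name `is_pattern_white_space` is hypothetical and not part of the proposal; the set of values is taken directly from the list above.

```cpp
#include <cassert>

// Sketch of the Pattern_White_Space set listed above, operating on
// Unicode code points instead of single code units.
constexpr bool is_pattern_white_space(char32_t cp) {
    return (cp >= 0x0009 && cp <= 0x000D)  // '\t', '\n', '\v', '\f', '\r'
        || cp == 0x0020                    // SPACE
        || cp == 0x0085                    // NEXT LINE
        || cp == 0x200E || cp == 0x200F    // LEFT-TO-RIGHT / RIGHT-TO-LEFT MARK
        || cp == 0x2028 || cp == 0x2029;   // LINE / PARAGRAPH SEPARATOR
}

static_assert(is_pattern_white_space(U' '));
static_assert(is_pattern_white_space(0x2028));
static_assert(!is_pattern_white_space(0x00A0));  // NO-BREAK SPACE is not in the list
```

A real implementation would first have to decode the input into code points; that step is omitted here.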

    auto r0 = std::scan<char>("abcd", "ab{}d"); // r0->value() == 'c'

    auto r1 = std::scan<std::string, std::string>("abc \n def", "{} {}");
    const auto& [s1, s2] = r1->values(); // s1 == "abc", s2 == "def"

As mentioned above, the format string syntax consists of replacement fields delimited by curly brackets (`{` and `}`). Each of these replacement fields corresponds to a value to be scanned from the input range. The replacement field syntax is quite similar to `std::format`, as can be seen below. The `scan` syntax is mostly a subset of the `std::format` syntax: it omits the sign, `#`, and `0` options and the `{arg-id}` forms of width and precision, and it adds the two entries `i` and `u` under type.

`scan` replacement field syntax

    std-format-spec:
        fill-and-align_opt width_opt precision_opt L_opt type_opt
    fill-and-align:
        fill_opt align
    fill:
        any character other than { or }
    align: one of
        < > ^
    width:
        positive-integer
    precision:
        . nonnegative-integer
    type: one of
        a A b B c d e E f F g G i o p P s u x X ?

`std::format` replacement field syntax

    std-format-spec:
        fill-and-align_opt sign_opt #_opt 0_opt width_opt precision_opt L_opt type_opt
    fill-and-align:
        fill_opt align
    fill:
        any character other than { or }
    align: one of
        < > ^
    sign: one of
        + - space
    width:
        positive-integer
        { arg-id_opt }
    precision:
        . nonnegative-integer
        . { arg-id_opt }
    type: one of
        a A b B c d e E f F g G o p P s x X ?

- `rNN`, `RNN` for arbitrary-base integers (r/R stands for radix, as b/B is already taken)
- `U` for a Unicode code point
- `[...]` for a scanf-like set of characters
- `/.../` for regex

These are currently not proposed. Some of these are mentioned in § 6 Future extensions.
4.3. Format string specifiers
Below is a somewhat detailed description of each of the specifiers in a `std::scan` replacement field. This design attempts to maintain decent compatibility with `std::format` whenever practical, while also bringing in some ideas from `scanf`.
4.3.1. Manual indexing
Like `std::format`, `std::scan` supports manual indexing of arguments in format strings. If manual indexing is used, all of the argument indices have to be spelled out. Different from `std::format`, the same index can only be used once.

    auto r = std::scan<int, int, int>("0 1 2", "{1} {0} {2}");
    auto [i0, i1, i2] = r->values();
    // i0 == 1, i1 == 0, i2 == 2

4.3.2. Fill and align
    fill-and-align:
        fill_opt align
    fill:
        any character other than { or }
    align: one of
        < > ^

The fill and align options are valid for all argument types. The fill character is denoted by the `fill`-option, or if it is absent, the space character `' '`. The fill character can be any single Unicode scalar value. The field width is determined the same way as it is for `std::format`.
If an alignment is specified, the value to be parsed is assumed to be properly aligned with the specified fill character.
If a field width is specified, it will be taken to be the minimum number of characters to be consumed from the input range. If a field precision is specified, it will be taken to be the maximum number of characters to be consumed from the input range. If either field width or precision is specified, but no alignment is, the default alignment for the type is considered (see `std::format`).
For the `^` alignment, fill characters both before and after the value will be considered. The number of fill characters doesn't have to be equal: input will be parsed until either a non-fill character is encountered, or the (maximum) field precision is exhausted, after which checking is done for the (minimum) field width.
This spec is compatible with `std::format`, i.e., the same format string (with regard to fill and align) can be used with both `std::format` and `std::scan`, with round-trip semantics.
Note: For format type specifiers other than `c` (default for `char` and `wchar_t`, can be specified for `string` and `string_view`), leading whitespace is skipped regardless of alignment specifiers.

    auto r0 = std::scan<int>(" 42", "{}");          // r0->value() == 42, r0->range() == ""
    auto r1 = std::scan<char>("  x", "{}");         // r1->value() == ' ', r1->range() == " x"
    auto r2 = std::scan<char>("x ", "{}");          // r2->value() == 'x', r2->range() == " "
    auto r3 = std::scan<int>("    42", "{:6}");     // r3->value() == 42, r3->range() == ""
    auto r4 = std::scan<char>("x     ", "{:6}");    // r4->value() == 'x', r4->range() == ""

    auto r5 = std::scan<int>("***42", "{:*>}");     // r5->value() == 42, r5->range() == ""
    auto r6 = std::scan<int>("***42", "{:*>5}");    // r6->value() == 42, r6->range() == ""
    auto r7 = std::scan<int>("***42", "{:*>4}");    // r7->value() == 42, r7->range() == ""
    auto r8 = std::scan<int>("***42", "{:*>.4}");   // r8->value() == 4, r8->range() == "2"
    auto r9 = std::scan<int>("***42", "{:*>4.4}");  // r9->value() == 4, r9->range() == "2"

    auto r10 = std::scan<int>("42", "{:*>}");       // r10->value() == 42, r10->range() == ""
    auto r11 = std::scan<int>("42", "{:*>5}");      // ERROR (length_too_short)
    auto r12 = std::scan<int>("42", "{:*>.5}");     // r12->value() == 42, r12->range() == ""
    auto r13 = std::scan<int>("42", "{:*>5.5}");    // ERROR (length_too_short)

    auto r14 = std::scan<int>("42***", "{:*<}");    // r14->value() == 42, r14->range() == ""
    auto r15 = std::scan<int>("42***", "{:*<5}");   // r15->value() == 42, r15->range() == ""
    auto r16 = std::scan<int>("42***", "{:*<4}");   // r16->value() == 42, r16->range() == "*"
    auto r17 = std::scan<int>("42***", "{:*<.4}");  // r17->value() == 42, r17->range() == "*"
    auto r18 = std::scan<int>("42***", "{:*<4.4}"); // r18->value() == 42, r18->range() == "*"

    auto r19 = std::scan<int>("42", "{:*<}");       // r19->value() == 42, r19->range() == ""
    auto r20 = std::scan<int>("42", "{:*<5}");      // ERROR (length_too_short)
    auto r21 = std::scan<int>("42", "{:*<.5}");     // r21->value() == 42, r21->range() == ""
    auto r22 = std::scan<int>("42", "{:*<5.5}");    // ERROR (length_too_short)

    auto r23 = std::scan<int>("42", "{:*^}");       // r23->value() == 42, r23->range() == ""
    auto r24 = std::scan<int>("*42*", "{:*^}");     // r24->value() == 42, r24->range() == ""
    auto r25 = std::scan<int>("*42**", "{:*^}");    // r25->value() == 42, r25->range() == ""
    auto r26 = std::scan<int>("**42*", "{:*^}");    // r26->value() == 42, r26->range() == ""

    auto r27 = std::scan<int>("**42**", "{:*^6}");  // r27->value() == 42, r27->range() == ""
    auto r28 = std::scan<int>("*42**", "{:*^5}");   // r28->value() == 42, r28->range() == ""
    auto r29 = std::scan<int>("**42*", "{:*^5}");   // r29->value() == 42, r29->range() == ""
    auto r30 = std::scan<int>("**42*", "{:*^6}");   // ERROR (length_too_short)
    auto r31 = std::scan<int>("**42*", "{:*^.6}");  // r31->value() == 42, r31->range() == ""
    auto r32 = std::scan<int>("**42*", "{:*^6.6}"); // ERROR (length_too_short)

    auto r33 = std::scan<int>("#*42*", "{:*^}");    // ERROR (invalid_scanned_value)
    auto r34 = std::scan<int>("#*42*", "#{:*^}");   // r34->value() == 42, r34->range() == ""
    auto r35 = std::scan<int>("#*42*", "#{:#^}");   // ERROR (invalid_scanned_value)

    auto r36 = std::scan<int>("***42*", "{:*^3}");  // r36->value() == 42, r36->range() == ""
    auto r37 = std::scan<int>("***42*", "{:*^.3}"); // ERROR (invalid_fill)

4.3.3. Sign, #, and 0

    std-format-spec:
        ... sign_opt #_opt 0_opt ...
    sign: one of
        + - space

These flags would have no effect in `std::scan`, so they are disabled. Signs (both `+` and `-`), base prefixes, trailing decimal points, and leading zeroes are always allowed for arithmetic values. Disabling them would be a bad default for a higher-level facility like `std::scan`, so flags explicitly enabling them are not needed. Allowing them would just be misleading and lead to confusion about their behavior.
Note: This is incompatible with `std::format` format strings.
4.3.4. Width and precision
    width:
        positive-integer
        { arg-id_opt }
    precision:
        . nonnegative-integer
        . { arg-id_opt }

The width and precision specifiers are valid for all argument types. Their meaning is virtually the same as with `std::format`: the width specifies the minimum field width, whereas the precision specifies the maximum. The scanned value itself, and any fill characters, are counted as a part of said field width. Either one of these can be specified to set either a minimum or a maximum, or both to provide a range of valid field widths.
Having a value shorter than the minimum field width is an error. Having a value longer than the maximum field width is not possible: reading will be cut short once the maximum field width is reached. If the value parsed up to that point is not a valid value, an error is provided.

    // Minimum width of 2
    auto r0 = std::scan<int>("123", "{:2}");
    // r0->value() == 123, r0->range() == ""

    // Maximum width of 2
    auto r1 = std::scan<int>("123", "{:.2}");
    // r1->value() == 12, r1->range() == "3"

For compatibility with `std::format`, the width and precision specifiers are in field width units, which is specified to be 1 per Unicode (extended) grapheme cluster, except some grapheme clusters are 2 ([format.string.std] ¶ 13):
For a sequence of characters in UTF-8, UTF-16, or UTF-32, an implementation should use as its field width the sum of the field widths of the first code point of each extended grapheme cluster. Extended grapheme clusters are defined by UAX #29 of the Unicode Standard. The following code points have a field width of 2:
any code point with the East_Asian_Width="W" or East_Asian_Width="F" Derived Extracted Property as described by UAX #44 of the Unicode Standard
U+4dc0 – U+4dff (Yijing Hexagram Symbols)
U+1f300 – U+1f5ff (Miscellaneous Symbols and Pictographs)
U+1f900 – U+1f9ff (Supplemental Symbols and Pictographs)
The field width of all other code points is 1.
For a sequence of characters in neither UTF-8, UTF-16, nor UTF-32, the field width is unspecified.
This essentially maps 1 field width unit = 1 user-perceived character. It should be noted that with this definition, grapheme clusters like emoji have a field width of 2. This behavior is present in `std::format` today, but can potentially be surprising to users.
This meaning for both the width and precision specifiers is different from `scanf`, where the width means the number of code units to read. This is because the purpose of that specifier in `scanf` is to prevent buffer overflow. Because the current interface of the proposed `std::scan` doesn't allow reading into a user-defined buffer, this isn't a concern.
Specifying the width with another argument, like in `std::format`, is disallowed.
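
For comparison, the following sketch shows the `scanf` meaning of a field width using `std::sscanf` (not the proposed API): the width is an upper bound on the number of characters read, which makes it behave like the proposed precision (`{:.2}`) rather than the proposed width (`{:2}`). The helper name `read_two_wide` is hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: reads an int with a scanf field width of 2 and
// reports, via %n, how many characters of the input were consumed.
int read_two_wide(const char* input, int& consumed) {
    int value = 0;
    consumed = 0;
    // "%2d" reads at most two characters for the number.
    std::sscanf(input, "%2d%n", &value, &consumed);
    return value;
}
```

For the input `"123"`, this reads the value 12 and consumes two characters, leaving `"3"` behind, just like the `{:.2}` example above.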
4.3.5. Localized (L)

    std-format-spec:
        ... L_opt ...

Enables scanning of values in locale-specific forms.

- For integer types, allows for digit group separator characters, equivalent to `numpunct::thousands_sep` of the used locale. If digit group separator characters are used, their grouping doesn't have to match `numpunct::grouping`.
- For floating-point types, the same as above. In addition, the locale-specific radix separator character is used, from `numpunct::decimal_point`.
- For `bool`, the textual representation uses the appropriate strings from `numpunct::truename` and `numpunct::falsename`.
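
To show where these locale-specific characters and strings come from, here is a small sketch that installs a custom `std::numpunct` facet and queries it. The facet `example_punct`, its values, and the function `make_example_locale` are invented for illustration; how `std::scan` would consume the facet is not shown, since that is up to the proposed wording.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Hypothetical facet with a '.' digit group separator and a ',' radix separator.
struct example_punct : std::numpunct<char> {
protected:
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }  // groups of three
    char do_decimal_point() const override { return ','; }
    std::string do_truename() const override { return "wahr"; }
    std::string do_falsename() const override { return "falsch"; }
};

// Builds a locale using the facet above; the locale takes ownership of it.
inline std::locale make_example_locale() {
    return std::locale(std::locale::classic(), new example_punct);
}
```

Querying `std::use_facet<std::numpunct<char>>(make_example_locale())` then yields the separator characters and the `bool` names that a localized scan would be expected to accept.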
4.3.6. Type specifiers: strings
Type | Meaning |
---|---|
none, `s` | Copies from the input until a whitespace character is encountered. |
`?` | Copies an escaped string from the input. |
`c` | Copies from the input until the field width is exhausted. Does not skip preceding whitespace. Errors, if no field width is provided. |

The `s` specifier is consistent with `std::istream` and `std::string`:

    std::string word;
    std::istringstream{"Hello world"} >> word;
    // word == "Hello"

    auto r = std::scan<std::string>("Hello world", "{:s}");
    // r->value() == "Hello"

Note: The `c` specifier is consistent with `scanf`, but is not supported for strings by `std::format`.
4.3.7. Type specifiers: integers
Integer values are scanned as if by using `std::from_chars`, except:

- A positive `+` sign and a base prefix are always allowed to be present.
- Preceding whitespace is skipped.
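
The exceptions listed above matter because plain `std::from_chars` is strict. The sketch below, using only `std::from_chars` (not the proposed API), shows that it rejects a leading `+`, does not skip whitespace, and does not recognize a `0x` prefix. The names `from_chars_outcome` and `try_from_chars` are made up for this example.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical result type: whether parsing succeeded, the parsed value,
// and how many characters std::from_chars consumed.
struct from_chars_outcome {
    bool ok;
    int value;
    std::size_t consumed;
};

inline from_chars_outcome try_from_chars(std::string_view s, int base = 10) {
    int value = 0;
    const auto res = std::from_chars(s.data(), s.data() + s.size(), value, base);
    return {res.ec == std::errc{}, value,
            static_cast<std::size_t>(res.ptr - s.data())};
}
```

With this helper, `"42"` and `"-42"` parse as expected, `"+42"` and `" 42"` fail, and `"0x1f"` in base 16 yields 0 after consuming only the leading `0`.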

Type | Meaning |
---|---|
`b`, `B` | `from_chars` with base 2. The base prefix is `0b` or `0B`. |
`o` | `from_chars` with base 8. For non-zero values, the base prefix is `0`. |
`x`, `X` | `from_chars` with base 16. The base prefix is `0x` or `0X`. |
`d` | `from_chars` with base 10. No base prefix. |
`u` | `from_chars` with base 10. No base prefix. No sign allowed. |
`i` | Detect base from a possible prefix, default to decimal. |
`c` | Copies a character from the input. |
none | Same as `d` |

Note: The flags `u` and `i` are not supported by `std::format`. These flags are consistent with `scanf`.
Note: [SCNLIB] also supports an additional flag for octal numbers, and two further possible octal number prefixes. These are currently not proposed.
4.3.8. Type specifiers: CharT
Type | Meaning |
---|---|
none, `c` | Copies a character from the input. |
`b`, `B`, `d`, `i`, `o`, `u`, `x`, `X` | Same as for integers. |
`?` | Copies an escaped character from the input. |

Reading a `CharT` with the `c` type specifier will just read a single code unit of type `CharT`. This can lead to invalid encoding in the scanned values.

    // As proposed:
    // U+12345 is 0xF0 0x92 0x8D 0x85 in UTF-8
    auto r = std::scan<char, std::string>("\u{12345}", "{}{}");
    auto& [ch, str] = r->values();
    // ch == '\xF0'
    // str == "\x92\x8d\x85" (invalid utf-8)

    // This is the same behavior as with iostreams today
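
The iostreams behavior referred to in the last comment can be reproduced today. The sketch below extracts one `char` and then the remaining token from a UTF-8 encoded string using `std::istringstream`; the names `split_result` and `read_char_then_string` are hypothetical, and the default ("C") global locale is assumed.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result type for the example below.
struct split_result {
    char first;
    std::string rest;
};

// Extracts a single char (one code unit), then the following
// whitespace-delimited token, using iostreams.
inline split_result read_char_then_string(const std::string& input) {
    std::istringstream in{input};
    split_result out{};
    in >> out.first >> out.rest;
    return out;
}
```

Given the four UTF-8 code units of U+12345, the `char` receives only the lead byte, and the string receives the three continuation bytes, which on their own are not valid UTF-8.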
4.3.9. Type specifiers: bool
Type | Meaning |
---|---|
`s` | Allows for textual representation, i.e. `true` or `false` |
`b`, `B`, `d`, `i`, `o`, `u`, `x`, `X` | Allows for integral representation, i.e. `0` or `1` |
none | Allows for both textual and integral representation: i.e. `true`, `1`, `false`, or `0`. |
4.3.10. Type specifiers: floating-point types
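Similar to integer types, floating-point values are scanned as if by using `std::from_chars`, except:

- A positive `+` sign is always allowed to be present.
- Preceding whitespace is skipped.

Type | Meaning |
---|---|
`a`, `A` | `from_chars` with `chars_format::hex`, with `0x`/`0X`-prefix allowed. |
`e`, `E` | `from_chars` with `chars_format::scientific`. |
`f`, `F` | `from_chars` with `chars_format::fixed`. |
`g`, `G` | `from_chars` with `chars_format::general`. |
none | `from_chars` with `chars_format::general \| chars_format::hex`, with `0x`/`0X`-prefix allowed. |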
4.3.11. Type specifiers: pointers
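`std::format` supports formatting pointers of type `void*` and `const void*`. For consistency's sake, `std::scan` also supports reading a `void*` or `const void*`. Unlike `std::format`, `nullptr_t` is not supported.

Type | Meaning |
---|---|
none, `p`, `P` | as if by reading a value of type `uintptr_t` with the `x` type specifier |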
4.4. Ranges
We propose that `std::scan` take a range as its input. This range should satisfy the requirements of `std::ranges::forward_range` to enable look-ahead, which is necessary for parsing.

    template <class Range, class CharT>
    concept scannable-range =
        ranges::forward_range<Range> &&
        same_as<ranges::range_value_t<Range>, CharT> &&
        (same_as<CharT, char> || same_as<CharT, wchar_t>);

For a range to be a `scannable-range`, its character type (range `value_type`, code unit type) needs to also be correct, i.e. it needs to match the character type of the format string. Mixing and matching character types between the input range and the format string is not supported.

    scan<int>("42", "{}");   // OK
    scan<int>(L"42", L"{}"); // OK
    scan<int>(L"42", "{}");  // Error: wchar_t[N] is not a scannable-range<char>

It should be noted that standard range facilities related to iostreams, namely `std::ranges::istream_view`, model only `input_range`. Thus, they can't be used with `std::scan`, and therefore, for example, `stdin` can't be read directly using `std::scan`. The reference implementation deals with this by providing a range type that wraps the underlying input source and provides a `forward_range`-compatible interface to it. At this point, this is deemed out of scope for this proposal.
As mentioned above, `forward_range`s are needed to support proper lookahead and rollback. For example, when reading an `int` with the `i` format specifier (detect base from prefix), whether a character is part of the `int` can't be determined before reading past it.

    // Hex value "0xf"
    auto r1 = std::scan<int>("0xf", "{:i}");
    // r1->value() == 0xf
    // r1->range().empty() == true

    // (Octal) value "0", with "xg" left over
    auto r2 = std::scan<int>("0xg", "{:i}");
    // r2->value() == 0
    // r2->range() == "xg"

    // Compare with sscanf:

    int val{}, n{};
    int r = std::sscanf("0xf", "%i%n", &val, &n);
    // val == 0xf
    // n == 3 -> remainder == ""
    // r == 1 -> SUCCESS

    r = std::sscanf("0xg", "%i%n", &val, &n);
    // val == 0
    // n == 2 -> remainder == "g"
    // r == 1 -> SUCCESS

The same behavior can be observed with floating-point values, when using exponents: whether the input is parsed in its entirety as a number, or only the mantissa is parsed with the rest left over, depends on whether the characters following the exponent marker form a valid exponent.
For user-defined types, arbitrarily-long lookahead or rollback can be required.
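
The floating-point case can be observed with `std::strtod`, which has to read past the exponent marker before it knows whether an exponent follows. This is a sketch using the C library, not the proposed API, and the helper name `parse_prefix` is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: parses a double from the start of `input` and reports
// how many characters were actually part of the number.
inline double parse_prefix(const char* input, std::size_t& consumed) {
    char* end = nullptr;
    const double value = std::strtod(input, &end);
    consumed = static_cast<std::size_t>(end - input);
    return value;
}
```

For `"1e+5"` all four characters belong to the number, while for `"1e+x"` only the leading `1` does: the parser has to look at `e`, `+`, and `x` before it can roll back to just after the `1`.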
4.5. Argument passing, and return type of scan
`std::scan` is proposed to return the values it scans, wrapped in an `expected`.

    auto result = std::scan<int>(input, "{}");
    auto [i] = result->values();
    // or (only a single scanned value):
    auto i = result->value();

The rationale for this is as follows:

- With output parameters, it would be easy to accidentally use uninitialized values. With return values, the values can only be accessed when the operation is successful.
- Modern C++ API design principles favor return values over output parameters.

It should be noted that not using output parameters removes a channel for user customization. For example, [FMT] uses `fmt::arg` to specify named arguments, and `fmt::underlying` for easy formatting of enumerators. The same isn't directly possible here, without customizing the type to be scanned itself.
The return type of `scan`, `scan_result`, contains a `subrange` over the unparsed input. This can be accessed with the member function `range()`. The type of this subrange is given by an exposition-only type alias, `borrowed-tail-subrange-t`, that is defined as follows:

    template <typename R>
    using borrowed-tail-subrange-t = std::conditional_t<
        ranges::borrowed_range<R>,
        ranges::subrange<ranges::iterator_t<R>, ranges::sentinel_t<R>>,
        ranges::dangling>;

Compare this with `ranges::borrowed_subrange_t`, which is defined as `ranges::subrange<ranges::iterator_t<R>, ranges::iterator_t<R>>`, when the range models `borrowed_range`.
This kind of subrange is returned to avoid having to advance to the end of the range in order to return an iterator pointing to it: we can just return the sentinel we're given, instead.
In addition to a subrange, as pointed out above, the success side of the returned expected also contains a `tuple` of the scanned values. This tuple can be retrieved with the `values()` member function, or if there's only a single scanned value, also with `value()`.
4.5.1. Design alternatives
As proposed, `std::scan` returns an `expected`, containing either an iterator and a tuple, or a `scan_error`.
An alternative could be returning a `tuple`, with a result object as its first (0th) element, and the parsed values occupying the rest. This would enable neat usage of structured bindings:

    // NOT PROPOSED, design alternative
    auto [r, i] = std::scan<int>("42", "{}");

However, there are two possible issues with this design:

1. It's easy to accidentally skip checking whether the operation succeeded, and access the scanned values regardless. This could be a potential security issue (even though the values would always be at least value-initialized, not default-initialized). Returning an expected forces checking for success.
2. The numbering of the elements in the returned tuple would be off-by-one compared to the indexing used in format strings:

    auto r = std::scan<int>("42", "{0}");
    // std::get<0>(r) refers to the result object
    // std::get<1>(r) refers to {0}

For the same reason as enumerated in 2. above, the `scan_result` type as proposed doesn't follow the tuple protocol, so that structured bindings can't be used with it:

    // NOT PROPOSED
    auto result = std::scan<int>("42", "{0}");
    // std::get<0>(*result) would refer to the iterator
    // std::get<1>(*result) would refer to {0}

4.6. Error handling
Contrasting with
, this proposed library communicates errors with return values,
instead of throwing exceptions. This is because error conditions are expected to be much
more frequent when parsing user input, as opposed to text formatting.
With the introduction of
, error handling using return values is also more ergonomic than before,
and it provides a vocabulary type we can use here, instead of designing something novel.
holds an enumerated error code value, and a message string.
The message is used in the same way as the message in
:
it gives more details about the error, but its contents are unspecified.
// Not a specification, just exposition
class scan_error {
public:
  enum code {
    // Tried to read from an empty range,
    // or the input ended unexpectedly.
    end_of_input,

    // The format string was invalid:
    // This will often be caught at compile time,
    // except when using `std::runtime_format`.
    invalid_format_string,

    // A generic error, for when the input
    // did not contain a valid representation
    // for the type to be scanned.
    invalid_scanned_value,

    // Literal character specified in the format string
    // was not found in the source.
    invalid_literal,

    // Too many fill characters scanned,
    // field precision (maximum field width) exceeded.
    invalid_fill,

    // Scanned field width was shorter than
    // what was specified as the minimum field width.
    length_too_short,

    // Value too large (higher than the maximum value)
    value_positive_overflow,

    // Value too small (lower than the minimum value)
    value_negative_overflow,

    // Value magnitude too small, sign +
    // (between 0 and the smallest subnormal)
    value_positive_underflow,

    // Value magnitude too small, sign -
    // (between 0 and the smallest subnormal)
    value_negative_underflow
  };

  constexpr scan_error(enum code, const char*);

  constexpr auto code() const noexcept -> enum code;
  constexpr const char* msg() const;
};
Note: [SCNLIB] has an additional error code enumerator,
.
It’s currently used when the input is not a range, but something like a file or an
.
As these kinds of input are currently not supported with this proposal, this is not proposed.
Note: A previous revision of this proposal had fewer enumerators,
with the overflow/underflow enumerators being one
,
and
,
, and
being folded into
.
The added granularity provided in this revision was found to be useful.
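To illustrate why separate overflow enumerators carry information that a single out-of-range code cannot, here is a minimal sketch built on `std::from_chars`. The names `int_scan_status` and `scan_int_sketch` are hypothetical and are not part of this proposal or of [SCNLIB]; the sketch only assumes that a leading `-` identifies the negative case.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical status codes, loosely modeled on the enumerators discussed above.
enum class int_scan_status {
    ok,
    invalid_scanned_value,
    value_positive_overflow,
    value_negative_overflow
};

// Parses the whole token as an int and classifies out-of-range input by sign.
inline int_scan_status scan_int_sketch(std::string_view token, int& out) {
    const char* first = token.data();
    const char* last = first + token.size();
    auto [ptr, ec] = std::from_chars(first, last, out);
    if (ec == std::errc::result_out_of_range) {
        // std::errc has one out-of-range code; the sign tells the two cases apart.
        return token.front() == '-' ? int_scan_status::value_negative_overflow
                                    : int_scan_status::value_positive_overflow;
    }
    if (ec != std::errc{} || ptr != last) {
        return int_scan_status::invalid_scanned_value;
    }
    return int_scan_status::ok;
}
```

A caller that only saw `std::errc::result_out_of_range` would have to re-inspect the input to learn which bound was exceeded; with distinct codes that information is part of the result.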
The reason why we propose adding the type
instead of just using
is
that we want to avoid losing information. The enumerators of
are insufficient for
this use, as is evident from the table below: there are no clear one-to-one mappings between
and
, but
would need to cover a lot of cases.
Also,
has a lot of unnecessary error codes, and a
The
in
is extremely useful for user code, for use in logging and debugging.
Even with the
enumerators, more information is often needed, to isolate any possible problem.
Possible mappings from
to
could be:
|
|
---|---|
|
|
| |
| |
| |
| |
| |
|
|
| |
| |
|
Note: [SCNLIB] provides a member function,
,
that performs this mapping.
Currently, as proposed, the message contained in a
is of type
.
Additionally, the validity of this message is only guaranteed up until the next call to a scanning function.
This allows for performant use of string literals, but also leaves the opportunity for the implementation
to do interesting things, for example by using thread-local storage to construct a custom error message,
without allocating or using a
. Using
here would needlessly bloat up the type,
both in terms of its size and its performance.
[SCNLIB] currently only uses string literals for its error messages,
except when a user-defined
throws a
,
for which TLS is utilized. See § 4.9 Extensibility below for more details.
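The lifetime rule described above can be sketched as follows. This is an illustration under assumptions, not the [SCNLIB] implementation: `error_sketch`, `make_literal_error`, and `make_custom_error` are hypothetical names, and the 128-byte buffer size is arbitrary.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical error type holding a non-owning message pointer.
struct error_sketch {
    int code;
    const char* msg;  // not owned by the error object
};

// String literal: static storage duration, no allocation, always valid.
inline error_sketch make_literal_error() {
    return {1, "unexpected end of input"};
}

// Custom message formatted into a per-thread buffer.
// The text stays valid only until the next call on the same thread overwrites it.
inline error_sketch make_custom_error(const char* detail) {
    thread_local char buffer[128];
    std::snprintf(buffer, sizeof buffer, "user scanner failed: %s", detail);
    return {2, buffer};
}
```

Two errors produced by `make_custom_error` on the same thread share one buffer, so the first message is overwritten by the second. That is the trade-off the validity guarantee makes explicit.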
4.7. Binary footprint and type erasure
We propose using a type erasure technique to reduce the per-call binary code size. The scanning function that uses variadic templates can be implemented as a small inline wrapper around its non-variadic counterpart:
template <scannable-range<char> Range>
auto vscan(Range&& range, string_view fmt, scan_args args)
  -> expected<ranges::borrowed-tail-subrange-t<Range>, scan_error>;

template <typename... Args, scannable-range<char> SourceRange>
auto scan(SourceRange&& source, scan_format_string<SourceRange, Args...> format)
  -> expected<scan_result<ranges::borrowed-tail-subrange-t<SourceRange>, Args...>, scan_error> {
  auto result = make_scan_result<SourceRange, Args...>();
  fill_scan_result(result, vscan(std::forward<SourceRange>(source), format, make_scan_args(result->values())));
  return result;
}
As shown in [P0645] this dramatically reduces binary code size, which will make
comparable to
on this metric.
type erases the arguments that are to be scanned.
This is similar to
, used with
.
returns a default-constructed
,
containing an empty subrange and a tuple of value-initialized arguments.
This is the value that will be returned from
.
The values will be populated by
, which will be given a reference to these values
through the type-erased
.
The subrange will be set by
, which is described below.
This approach allows us to take advantage of NRVO,
which will eliminate copies and moves of the scan argument tuple out of
into the caller’s scope.
takes the return value of
,
and either writes the leftover range indicated by it into
, or writes an error.
It’s essentially one-liner sugar for this:
void fill_scan_result(auto& result, auto&& vscan_result) {
  // skipping type checking
  if (vscan_result) {
    result->set-range(*vscan_result);
  } else {
    result = unexpected(vscan_result.error());
  }
}
Note: This implementation of
is more complicated
compared to
, which can be described as a one-liner calling
.
This is because the arguments that are written to by
need to outlive the call to
,
so that they can be safely returned from
.
A previous revision of this proposal used a different approach to type erasure and
the implementation of
. In that approach,
would store both a
of scanning arguments,
and an array of
s, that erased these arguments. Then, after calling
,
the return object would be constructed by moving the
into it.
This had comparatively very bad codegen and performance for non-trivially copyable types, as copying or moving them on return couldn’t be elided. Compare this to the current approach, where we don’t have an intermediary tuple, but construct the return object straight away, and write directly to it.
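The following sketch shows the general shape of this technique in plain C++17: a thin variadic wrapper that constructs the return object first, and a single non-template function that writes into it through erased pointers. All names (`erased_arg`, `vscan_sketch`, `scan_result_sketch`, `scan_sketch`) are hypothetical. As simplifying assumptions, `std::optional` stands in for `expected`, values are separated by spaces, and there is no format string.

```cpp
#include <array>
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <tuple>
#include <variant>

// Hypothetical erased argument: a pointer to one of the supported types.
using erased_arg = std::variant<int*, std::string*>;

// Non-template core, compiled once for all argument lists.
// Returns the unparsed tail on success, std::nullopt on failure.
inline std::optional<std::string_view>
vscan_sketch(std::string_view input, const erased_arg* args, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        while (!input.empty() && input.front() == ' ') input.remove_prefix(1);
        std::size_t len = 0;
        while (len < input.size() && input[len] != ' ') ++len;
        if (len == 0) return std::nullopt;  // ran out of input
        const std::string_view token = input.substr(0, len);
        if (std::holds_alternative<int*>(args[i])) {
            int* out = std::get<int*>(args[i]);
            const char* last = token.data() + token.size();
            auto [ptr, ec] = std::from_chars(token.data(), last, *out);
            if (ec != std::errc{} || ptr != last) return std::nullopt;
        } else {
            *std::get<std::string*>(args[i]) = std::string(token);
        }
        input.remove_prefix(len);
    }
    return input;
}

template <typename... Args>
struct scan_result_sketch {
    std::string_view rest;       // leftover input
    std::tuple<Args...> values;  // scanned values
};

// Thin variadic wrapper: builds the result first, then erases pointers into it.
template <typename... Args>
std::optional<scan_result_sketch<Args...>> scan_sketch(std::string_view input) {
    std::optional<scan_result_sketch<Args...>> result{std::in_place};
    const std::array<erased_arg, sizeof...(Args)> erased = std::apply(
        [](Args&... vs) {
            return std::array<erased_arg, sizeof...(Args)>{erased_arg{&vs}...};
        },
        result->values);
    if (auto tail = vscan_sketch(input, erased.data(), erased.size())) {
        result->rest = *tail;
    } else {
        result.reset();
    }
    return result;  // named local, eligible for NRVO
}
```

Because the values live inside `result` from the start, nothing is copied or moved between an intermediate tuple and the returned object; only the wrapper is instantiated per argument list.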
4.8. Safety
is arguably more unsafe than
because
([ATTR]) implemented by GCC and Clang
doesn’t catch the whole class of buffer overflow bugs, e.g.
char s [ 10 ]; std :: sscanf ( input , "%s" , s ); // s may overflow.
Specifying the maximum length in the format string above solves the issue but is error-prone, especially since one has to account for the terminating null.
Unlike
, the proposed facility relies on variadic templates instead of
the mechanism provided by
. The type information is captured
automatically and passed to scanners, guaranteeing type safety and making many of
the
specifiers redundant (see § 4.2 Format strings). Memory management is
automatic to prevent buffer overflow errors.
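For comparison, this is what the manual width accounting looks like with `sscanf`. The helper name `read_word_bounded` is hypothetical; the point of the sketch is only that the width in the format string must be one less than the buffer size.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Reads one whitespace-delimited word into a 10-byte buffer.
inline std::string read_word_bounded(const char* input) {
    char s[10] = {};
    // "%9s", not "%10s": sscanf stores up to 9 characters plus the terminating null.
    if (std::sscanf(input, "%9s", s) != 1) {
        return {};
    }
    return s;
}
```

The buffer size and the width specifier have to be kept in sync by hand, which is exactly the kind of coupling a type-aware facility removes.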
4.9. Extensibility
We propose an extension API for user-defined types similar to
,
used with
. It separates format string processing and parsing, enabling
compile-time format string checks, and allows extending the format specification
language for user types. It enables scanning of user-defined types.
auto r = scan < tm > ( input , "Date: {0:%Y-%m-%d}" );
This is done by providing a specialization of
for
:
template <> struct scanner < tm > { template < class ParseContext > constexpr auto parse ( ParseContext & ctx ) -> typename ParseContext :: iterator ; template < class ScanContext > auto scan ( tm & t , ScanContext & ctx ) const -> expected < typename ScanContext :: iterator , scan_error > ; };
The
function parses the
portion of the format
string corresponding to the current argument, and
parses the
input range
and stores the result in
.
An implementation of
can potentially use the istream extraction
for user-defined type
, if available.
Error handling in
differs from the other parts of this proposal.
To facilitate better compile time error checking,
doesn’t return an
.
Instead, to report errors, it can throw an exception of type
,
which is an exception type derived from
.
Then, if
is being executed at compile time, and it throws,
it makes the program ill-formed (
is not a constant expression).
This also makes the compiler error message easy to read, as it’ll point right where
the
expression is, with the error description.
If
is executed at run time, the exception is caught in the library,
and eventually returned from
inside a
, with the error code of
.
A previous revision of this paper proposed returning
from
.
While consistent with
, it had the issue of diminished quality of compiler error messages.
Returning an
value from
was not a compile-time error in itself,
so the compile-time error only manifested from inside the library, where it no longer
had access to the original context and error message.
By
ing, the compiler can point literally to the very line of code that reported the error.
Note: [SCNLIB] supports an additional means of error reporting from
.
has a member function,
, that’s not
.
This is useful for users who aren’t using exceptions, but it’s not proposed in this paper.
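The mechanism of reporting parse errors by throwing can be sketched in isolation. The names `spec_error_sketch`, `parse_spec_sketch`, and `checked_parse_sketch` are hypothetical; the sketch only assumes that the parser is a `constexpr` function and that the library wraps run-time calls in a `try` block, with `std::optional` standing in for the error return.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string_view>

// Hypothetical exception type for malformed format specs.
struct spec_error_sketch : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Returns the index of the '}' that closes the spec; throws if there is none.
constexpr std::size_t parse_spec_sketch(std::string_view spec) {
    for (std::size_t i = 0; i < spec.size(); ++i) {
        if (spec[i] == '}') return i;
    }
    throw spec_error_sketch("missing closing brace");
}

// Compile time: a valid spec yields a constant expression.
static_assert(parse_spec_sketch("%Y-%m-%d}") == 8);
// A spec without '}' would reach the throw, which is not allowed in a constant
// expression, so the following line would fail to compile if uncommented:
// static_assert(parse_spec_sketch("%Y-%m-%d") == 8);

// Run time: the exception is caught and converted into a return value.
inline std::optional<std::size_t> checked_parse_sketch(std::string_view spec) {
    try {
        return parse_spec_sketch(spec);
    } catch (const spec_error_sketch&) {
        return std::nullopt;
    }
}
```

The same function thus serves both purposes: a hard error during constant evaluation, and an ordinary error value when the format string is only known at run time.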
4.10. Locales
As pointed out in [N4412]:
There are a number of communications protocol frameworks in use that employ text-based representations of data, for example XML and JSON. The text is machine-generated and machine-read and should not depend on or consider the locales at either end.
To address this,
provided control over the use of locales. We propose
doing the same for the current facility by performing locale-independent parsing
by default and designating separate format specifiers for locale-specific ones.
In particular, locale-specific behavior can be opted into by using the
format specifier, and supplying a
object.
std::locale::global(std::locale::classic());

// {} uses no locale
// {:L} uses the global locale
auto r0 = std::scan<double, double>("1.23 4.56", "{} {:L}");
// r0->values(): (1.23, 4.56)

// {} uses no locale
// {:L} uses the supplied locale
auto r1 = std::scan<double, double>(std::locale{"fi_FI"}, "1.23 4,56", "{} {:L}");
// r1->values(): (1.23, 4.56)
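The difference between locale-independent and locale-specific parsing can be reproduced with existing standard facilities. The sketch below uses iostreams rather than the proposed API, and a hand-written `numpunct` facet (the hypothetical `comma_decimal_point`) instead of a named locale such as `fi_FI`, so that it does not depend on which locales are installed.

```cpp
#include <cassert>
#include <cmath>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that uses ',' as the decimal separator.
struct comma_decimal_point : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Builds a locale that differs from the classic one only in its numpunct facet.
inline std::locale comma_locale() {
    return std::locale(std::locale::classic(), new comma_decimal_point);
}

// Extracts a double from text using the given locale.
inline double parse_double_with(const std::locale& loc, const std::string& text) {
    std::istringstream in(text);
    in.imbue(loc);
    double value = 0.0;
    in >> value;
    return value;
}
```

With the classic locale, parsing `"4,56"` stops at the comma and yields `4`; with the comma locale, the same text is read as `4.56`. This is the kind of behavior the `L` specifier makes opt-in.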
4.11. Encoding
In a similar manner as with
, input given to
is assumed
to be in the (ordinary/wide) literal encoding.
If an error in encoding is encountered while reading a value of a string type
(
,
), an
error is returned.
For other types, the reading is stopped, as the parser can’t parse a numeric value from
something that isn’t digits, indirectly causing an error.
// Invalid UTF-8
auto r = std::scan<std::string>("a\xc3 ", "{}");
// bool(r) == false
// r.error().code() == std::scan_error::invalid_scanned_value

auto r2 = std::scan<int>("1\xc3 ", "{}");
// bool(r2) == true
// r2->value() == 1
// r2->range() == "\xc3 "
Reading raw bytes (not in the literal encoding) into a
isn’t directly supported.
This can be achieved either with simpler range algorithms already in the standard,
or by using a custom type or scanner.
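As a rough illustration of what detecting an encoding error involves for UTF-8 input, here is a structural check. The function name `is_structurally_valid_utf8` is hypothetical, and the check is deliberately incomplete: it verifies lead bytes and continuation bytes only, and does not reject overlong encodings or surrogate code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns true if every lead byte is followed by the expected number of
// continuation bytes (10xxxxxx). Overlong forms and surrogates are not rejected.
inline bool is_structurally_valid_utf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        if (lead < 0x80) {
            extra = 0;  // single-byte (ASCII) code unit
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (s.size() - i - 1 < extra) return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}
```

In the string example above, `\xc3` announces a two-byte sequence but is followed by a space, so a check of this kind fails at that position.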
4.12. Performance
The API allows an efficient implementation that minimizes virtual function calls
and dynamic memory allocations, and avoids unnecessary copies. In particular,
since it doesn’t need to guarantee the lifetime of the input across multiple
function calls,
can take
avoiding an extra string copy
compared to
. Since, in the default case, it also doesn’t
deal with locales, it can internally use something like
.
We can also avoid unnecessary copies required by
when parsing strings,
e.g.
auto r = std :: scan < std :: string_view , int > ( "answer = 42" , "{} = {}" );
Because the format strings are checked at compile time, while being aware
of the exact types to scan, and the source range type, it’s possible to check
at compile time, whether scanning a
would dangle, or if it’s
possible at all (reading from a non-
).
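The copy-avoidance argument can be seen in a hand-written equivalent of the example above. The function `parse_key_value_sketch` is hypothetical and hard-codes the `" = "` separator; it assumes the caller keeps the source alive for as long as the returned view is used, which is the dangling concern mentioned above.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Parses "<key> = <int>". The key is a view into `input`, not a copy.
inline std::optional<std::pair<std::string_view, int>>
parse_key_value_sketch(std::string_view input) {
    const std::size_t sep = input.find(" = ");
    if (sep == std::string_view::npos) return std::nullopt;
    const std::string_view key = input.substr(0, sep);
    const std::string_view digits = input.substr(sep + 3);
    int value = 0;
    auto [ptr, ec] = std::from_chars(digits.data(), digits.data() + digits.size(), value);
    if (ec != std::errc{}) return std::nullopt;
    return std::make_pair(key, value);
}
```

The returned key points into the caller's buffer, so no string is allocated. Scanning into a `std::string` from the same input would instead copy the characters.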
4.13. Integration with chrono
The proposed facility can be integrated with
([P0355])
via the extension mechanism, similarly to the integration between chrono and text
formatting proposed in [P1361]. This will improve consistency between parsing
and formatting, make parsing multiple objects easier, and allow avoiding dynamic
memory allocations without resorting to the deprecated
.
Before:
std :: istringstream is ( "start = 10:30" ); std :: string key ; char sep ; std :: chrono :: seconds time ; is >> key >> sep >> std :: chrono :: parse ( "%H:%M" , time );
After:
auto result = std :: scan < std :: string , std :: chrono :: seconds > ( "start = 10:30" , "{0} = {1:%H:%M}" ); const auto & [ key , time ] = result -> values ();
Note that the
version additionally validates the separator.
Scanning of time points, clock values, and calendar values is implemented in [SCNLIB].
4.14. Impact on existing code
The proposed API is defined in a new header and should have no impact on existing code.
5. Existing work
[SCNLIB] is a C++ library that serves as the reference implementation of this proposal. Its interface and behavior follow the design described in this paper.
[FMT] has a prototype implementation of an earlier version of the proposal.
6. Future extensions
To keep the scope of this paper somewhat manageable, we’ve chosen to only include functionality we consider fundamental. This leaves the design space open for future extensions and other proposals. However, we are not categorically against exploring this design space, if it is deemed critical for v1.
All of the possible future extensions described below are implemented in [SCNLIB].
6.1. Integration with stdio
In the SG9 meeting in Kona (11/2023), the following was polled:
SG9 feels that it essential for std::scan to be useable with stdin and cin (and the paper would be incomplete without this feature).
SF | F | N | A | SA |
---|---|---|---|---|
0 | 5 | 1 | 3 | 0 |
We’ve decided to follow the route of
+
,
i.e. to not complicate and bloat this paper further by involving I/O.
This is still an important avenue of future expansion,
and the library proposed in this paper is designed and specified in such a way
as to easily allow that expansion.
[SCNLIB] implements this by providing a function,
,
for interfacing with
, and by allowing passing in
s as input
to
, in addition to
s.
6.2. scanf-like [character set] matching
supports the
format specifier, which allows for matching for a set of accepted
characters. Unfortunately, because some of the syntax for specifying that set is
implementation-defined, the utility of this functionality is hampered.
Properly specified, this could be useful.
auto r = scan<string>("abc123", "{:[a-zA-Z]}");
// r->value() == "abc", r->range() == "123"

// Compare with:
char buf[N];
sscanf("abc123", "%[a-zA-Z]", buf);

// ...

auto _ = scan<string>(..., "{:[^\n]}"); // match until newline
It should be noted that, while the syntax is quite similar, this is not a regular expression. This syntax is intentionally far more limited, as it is meant for simple character matching.
This syntax is very useful for slightly more complicated parsing, but it is still left out in the interest of scope.
[SCNLIB] implements this syntax, providing support for matching single characters/code points
(
) and code point ranges (
).
Full regex matching is also supported with
.
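What a character-set specifier such as `[a-zA-Z]` would compute can be sketched with a plain function. The names `char_range_sketch` and `match_set_prefix` are hypothetical, the set is passed as data instead of being parsed from a format string, and the comparison assumes an encoding where the ranges are contiguous (as in ASCII).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Inclusive character range, e.g. {'a', 'z'} for a-z.
struct char_range_sketch {
    char first;
    char last;
};

// Returns the longest prefix of `input` whose characters all fall in some range.
inline std::string_view match_set_prefix(std::string_view input,
                                         const std::vector<char_range_sketch>& set) {
    std::size_t n = 0;
    while (n < input.size()) {
        const char c = input[n];
        bool accepted = false;
        for (const auto& r : set) {
            if (c >= r.first && c <= r.last) {
                accepted = true;
                break;
            }
        }
        if (!accepted) break;
        ++n;
    }
    return input.substr(0, n);
}
```

Applied to `"abc123"` with the ranges a-z and A-Z, the match is `"abc"` and the remainder is `"123"`, mirroring the values shown in the example above.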
6.3. Reading code points (or even grapheme clusters?)
is nowadays the type denoting a Unicode code point.
Reading individual code points, or even Unicode grapheme clusters, could be a useful feature.
Currently, this proposal only supports reading of individual code units (
or
).
[SCNLIB] supports reading Unicode code points with
.
6.4. Reading strings and chars of different width
In C++, we have character types other than
and
, too:
namely
,
, and
.
Currently, this proposal only supports reading strings with the same
character type as the input range, and reading
characters from
narrow
-oriented input ranges, as does
.
somewhat supports this with the
-flag (and the absence of one in
).
Providing support for reading differently-encoded strings could be useful.
// Currently supported:
auto r0 = scan<wchar_t>("abc", "{}");

// Not supported:
auto r1 = scan<char>(L"abc", L"{}");
auto r2 = scan<string, wstring, u8string, u16string, u32string>("abc def ghi jkl mno", "{} {} {} {} {}");
auto r3 = scan<string, wstring, u8string, u16string, u32string>(L"abc def ghi jkl mno", L"{} {} {} {} {}");
6.5. Scanning of ranges
Introduced in [P2286] for
, enabling the user to use
to scan ranges, could be useful.
6.6. Default values for scanned values
Currently, the values returned by
are value-constructed,
and assigned over if a value is read successfully.
It may be useful to be able to provide an initial value different from a value-constructed
one, for example, for preallocating a
, and possibly reusing it:
string str;
str.reserve(n);
auto r0 = scan<string>(..., "{}", {std::move(str)});
// ...
r0->value().clear();
auto r1 = scan<string>(..., "{}", {std::move(r0->value())});
This same facility could also be used for additional user customization, as pointed out in § 4.5 Argument passing, and return type of scan.
6.7. Assignment suppression / discarding values
supports discarding scanned values with the
specifier in the format string. [SCNLIB] provides similar functionality through a special type,
:
7. Specification
This wording is still quite preliminary, and will require more work. Note the similarity and referencing to [format] in some parts.
This wording is done relative to [N4988].
7.1. General
Add the header
to the appropriate place in the "C++ library headers" table in [headers],
respecting alphabetical order.
Add an entry for
to the appropriate place in [version.syn],
respecting alphabetical order. Set the value of the macro to the date of adoption of the paper.
#define __cpp_lib_scan 20XXXXL // also in <scan>
7.2. Scanning [scan]
7.2.1. Header <scan> synopsis [scan.syn]
namespace std { // [scan.fmt.string], class template basic_scan_format_string template < class charT , class Range , class ... Args > struct basic_scan_format_string ; template < class Range , class ... Args > using scan_format_string = basic_scan_format_string < char , type_identity_t < Range > , type_identity_t < Args > ... > ; template < class Range , class ... Args > using wscan_format_string = basic_scan_format_string < wchar_t , type_identity_t < Range > , type_identity_t < Args > ... > ; // [scan.error], class scan_error class scan_error ; // [scan.format.error], class scan_format_string_error class scan_format_string_error ; // [scan.result.result], class template scan_result template < class Range , class ... Args > class scan_result ; template < ranges :: range R > using borrowed - tail - subrange - t = conditional_t < ranges :: borrowed_range < R > , ranges :: subrange < ranges :: iterator_t < R > , ranges :: sentinel_t < R >> , ranges :: dangling > ; // exposition only template < class Range , class ... Args > using scan - result - type = expected < scan_result < borrowed - tail - subrange - t < Range > , Args ... > , scan_error > ; // exposition only // [scan.result], result types template < class Range , class ... Args > constexpr scan - result - type < Range , Args ... > make_scan_result (); template < class Result , class Range > constexpr void fill_scan_result ( expected < Result , scan_error >& out , expected < Range , scan_error >&& in ); template < class Range , class charT > concept scannable - range = ranges :: forward_range < Range > && same_as < ranges :: range_value_t < Range > , charT > && ( same_as < charT , char > || same_as < charT , wchar_t > ); // exposition only // [scan.functions], scanning functions template < class ... Args , scannable - range < char > Range > scan - result - type < Range , Args ... > scan ( Range && range , scan_format_string < Range , Args ... > fmt ); template < class ... 
Args , scannable - range < wchar_t > Range > scan - result - type < Range , Args ... > scan ( Range && range , wscan_format_string < Range , Args ... > fmt ); template < class ... Args , scannable - range < char > Range > scan - result - type < Range , Args ... > scan ( const locale & loc , Range && range , scan_format_string < Range , Args ... > fmt ); template < class ... Args , scannable - range < wchar_t > Range > scan - result - type < Range , Args ... > scan ( const locale & loc , Range && range , wscan_format_string < Range , Args ... > fmt ); template < class Range > using vscan - result - type = expected < borrowed - tail - subrange - t < Range > , scan_error > ; // exposition only template < scannable - range < char > Range > vscan - result - type < Range > vscan ( Range && range , string_view fmt , scan_args args ); template < scannable - range < wchar_t > Range > vscan - result - type < Range > vscan ( Range && range , wstring_view fmt , wscan_args args ); template < scannable - range < char > Range > vscan - result - type < Range > vscan ( const locale & loc , Range && range , string_view fmt , scan_args args ); template < scannable - range < wchar_t > Range > vscan - result - type < Range > vscan ( const locale & loc , Range && range , wstring_view fmt , wscan_args args ); // [scan.context], class template basic_scan_context template < class Range , class charT > class basic_scan_context ; using scan_context = basic_scan_context < unspecified , char > ; using wscan_context = basic_scan_context < unspecified , wchar_t > ; // [scan.scanner], class template scanner template < class T , class charT = char > struct scanner ; // [scan.scannable], concept scannable template < class T , class charT > concept scannable = see below ; // [scan.parse.ctx], class template basic_scan_parse_context template < class charT > class basic_scan_parse_context ; using scan_parse_context = basic_scan_parse_context < char > ; using wscan_parse_context = 
basic_scan_parse_context < wchar_t > ; // [scan.args], class template basic_scan_args template < class Context > class basic_scan_args ; using scan_args = basic_scan_args < scan_context > ; using wscan_args = basic_scan_args < wscan_context > ; // [scan.arg], class template basic_scan_arg template < class Context > class basic_scan_arg ; // [scan.arg.store], class template scan-arg-store template < class Context , class ... Args > class scan - arg - store ; // exposition only template < class Context = scan_context , class ... Args > constexpr scan - arg - store < Context , Args ... > make_scan_args ( std :: tuple < Args ... >& args ); template < class ... Args > constexpr scan - arg - store < wscan_context , Args ... > make_wscan_args ( std :: tuple < Args ... >& args ); }
7.2.2. Format string [scan.string]
7.2.2.1. General [scan.string.general]
A format string for arguments
is a (possibly empty) sequence of replacement fields, escape sequences, whitespace characters,
and characters other than
and
. Each character that is not
part of a replacement field or an escape sequence,
and is not a whitespace character, is matched with a character in the input.
An escape sequence is one of
or
.
It is matched with
or
, respectively, in the input.
For a sequence of characters in UTF-8, UTF-16, or UTF-32,
any code point with the
property as described by
UAX #31 of the Unicode standard is considered to be a whitespace character.
For a sequence of characters in neither UTF-8, UTF-16, nor UTF-32,
the set of characters considered to be whitespace characters is unspecified.
The syntax of replacement fields is as follows:
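replacement-field:
    { arg-id_opt scan-format-specifier_opt }
arg-id:
    0
    positive-integer
positive-integer:
    nonzero-digit
    positive-integer digit
nonnegative-integer:
    digit
    nonnegative-integer digit
nonzero-digit: one of
    1 2 3 4 5 6 7 8 9
digit: one of
    0 1 2 3 4 5 6 7 8 9
scan-format-specifier:
    : scan-format-spec
scan-format-spec:
    as specified by the scanner specialization for the argument type; cannot start with }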
scanner
specializations instead of formatter
specializations. The arg-id field specifies the index of the argument in
whose value is to be scanned from the input
instead of the replacement field. If there is no argument with the index arg-id in
,
the string is not a format string for
. The optional scan-format-specifier field explicitly specifies
a format for the scanned value.
[Example 1:
— end example]
If all arg-ids in a format string are omitted, argument indices 0, 1, 2, ... will automatically be used in that order. If some arg-ids are omitted and some are present, the string is not a format string. If there is any argument in args that doesn’t have a corresponding replacement field, or if there are multiple replacement fields corresponding to an argument in args, the string is not a format string for args.
[Note 1: A format string cannot contain a mixture of automatic and manual indexing. Every argument to be scanned must have one and exactly one corresponding replacement field in the format string. — end note]
The scan-format-spec field contains format specifications that define how the value should be scanned.
Each type can define its own interpretation of the scan-format-spec field.
If scan-format-spec does not conform to the format specifications for the argument type referred to by arg-id,
the string is not a format string for
.
[Example 2:
- For arithmetic, pointer, and string types the scan-format-spec is interpreted as a std-scan-format-spec as described in [scan.string.std].
- For user-defined scanner specializations, the behavior of the parse member function determines how the scan-format-spec is interpreted.
7.2.2.2. Standard format specifiers [scan.string.std]
Each
specialization described in [scan.scanner.spec] for fundamental and string types interprets scan-format-spec as a std-scan-format-spec.
[Note 1: The format specification can be used to specify such details as minimum field width, alignment, and padding. Some of the formatting options are only supported for arithmetic types. — end note]
The syntax of format specifications is as follows:
L
opt scan-typeopt {
or }
<
>
^
.
nonnegative-integera
A
b
B
c
d
e
E
f
F
g
G
i
o
p
P
s
u
x
X
?
Field widths are specified in field width units (see [format.string.std]).
The fill character is the character denoted by the fill option or, if the fill option is absent, the space character. For a format specification in UTF-8, UTF-16, or UTF-32, the fill character corresponds to a single Unicode scalar value. Fill characters are always assumed to have a field width of one.
[Note 2: The presence of a fill option is signaled by the character following it, which must be one of the alignment options. If the second character of std-scan-format-spec is not a valid alignment option, then it is assumed that the fill and align options are both absent. — end note]
The align option applies to all argument types. The meaning of the various alignment options is as specified in [tab:scan.align].
Option | Meaning |
---|---|
| Skips fill characters after the scanned value, until either a non-fill character is encountered, or the maximum field width is reached.
If no align option is specified, but a scan-width or scan-precision is, this is the option used
for non-arithmetic non-pointer types, , and , unless an integer presentation type is specified.
|
| Skips fill characters before the scanned value, until either a non-fill character is encountered, or the maximum field width is reached.
If the maximum field width is reached by only reading fill characters, an error with the code is returned;
If no align option is specified, but a scan-width or scan-precision is, this is the option used
for arithmetic types other than and , pointer types, or when any integer presentation type is specified.
|
|
Skips fill characters both before and after the scanned value, until either a non-fill character is encountered, or the maximum field width is reached.
If the maximum field width is reached by only reading fill characters, an error with the code is returned;
[Note 3: The number of fill characters doesn’t have to be equal both before and after the value. — end note] |
The scan-width option specifies the minimum field width. If the scan-width option is absent, the minimum field width is
.
Otherwise, the value of the positive-integer is interpreted as a decimal integer and used as the value of the option.
If the number of characters consumed for scanning a value, including the value itself and fill characters used for alignment,
but excluding possibly skipped preceding whitespace, is less than the minimum field width, an error with the code
is returned.
For the purposes of width computation, a string is assumed to be in a locale-independent, implementation-defined encoding.
Implementations should use either UTF-8, UTF-16, or UTF-32, on platforms capable of displaying Unicode text in a terminal.
It’s unclear if we can and/or should place a similar kind of normative recommendation here.
For a sequence of characters in UTF-8, UTF-16, or UTF-32, the algorithm for calculating field width is described in [format.string.std]. For a sequence of characters in neither UTF-8, UTF-16, nor UTF-32, the field width is unspecified.
The scan-precision option specifies the maximum field width. If a maximum field width is specified, it’s the maximum number of characters read from the source range for any given scanning argument, including the value itself and any fill characters used for alignment, but excluding any possibly discarded preceding whitespace. Reaching the maximum field width is not an error.
When the
option is used, the form used for scanning is called the locale-specific form.
The
option is only valid for arithmetic types, and its effect depends upon the type.
-
For integral types, the locale-specific form causes digit group separator characters to be accepted. These digit group separator characters are ignored in parsing, and their form is determined by the context’s locale.
-
For floating-point types, the locale-specific form causes the radix separator character and digit group separator characters to be accepted, as determined by the context’s locale.
-
For the textual representation of bool, the locale-specific form causes the accepted values to be determined as if by numpunct::truename and numpunct::falsename of the context’s locale.
The scan-type determines how the data should be scanned. Unless otherwise specified, before scanning a value, all whitespace characters are read and discarded from the input, until encountering a character that is not a whitespace character.
If the value to be scanned is of type
,
and
is false
for a source range of type
,
the string is not a format string for
, when using
as the type of the source range.
The available string presentation types are specified in [tab:scan.type.string].
Type | Meaning |
---|---|
none,
| Copies characters from the input until a whitespace character is encountered. |
| Copies characters from the input until the maximum field width is reached.
Preceding whitespace is not skipped.
If no value is given for the scan-precision option, the string is not a format string for .
|
| Copies the escaped string ([format.string.escaped]) from the input. |
The meaning of some non-string presentation types is defined in terms of a call to
.
In such cases, let
be a contiguous range of characters sourced from the input
and
be the scanning argument value.
Scanning is done as if by first copying characters from the input into
until the first character invalid for the presentation type is found,
after which
is called.
If
is an empty range, an error with the code
is returned.
[Note 4: Additional padding and adjustments are performed prior to calling
as specified by the format specifiers. — end note]
Integral types other than
and
are scanned as if by using an infinite precision integral type.
If its value cannot be represented in the integral type to be scanned,
an error with either the code
is returned if the value was positive,
and
if the value was negative.
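The range check described above can be approximated with `std::from_chars`, which consumes every digit even when the value does not fit and then reports `std::errc::result_out_of_range`. The sketch below is an assumption-laden illustration: `int_scan_code` and `scan_int` are hypothetical names that only borrow the enumerator names from `scan_error`, and only decimal input is handled.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for a subset of the scan_error codes.
enum class int_scan_code {
    ok,
    invalid_scanned_value,
    value_positive_overflow,
    value_negative_overflow
};

// Parses a decimal int from the start of `text`. Characters after the
// number are ignored in this sketch.
inline int_scan_code scan_int(std::string_view text, int& out) {
    const bool negative = !text.empty() && text.front() == '-';
    int value = 0;
    const std::from_chars_result res =
        std::from_chars(text.data(), text.data() + text.size(), value);
    if (res.ec == std::errc::invalid_argument)
        return int_scan_code::invalid_scanned_value;
    if (res.ec == std::errc::result_out_of_range)
        // The sign of the unrepresentable value selects the error code.
        return negative ? int_scan_code::value_negative_overflow
                        : int_scan_code::value_positive_overflow;
    out = value;
    return int_scan_code::ok;
}
```

The destination is written only on success, so a failed scan leaves the caller's value untouched.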
If the presentation type allows it, integral types other than
and
can have a base prefix. This is not copied into range
.
The available integer presentation types for integral types other than
and
are specified in [tab:scan.type.int].
[Example 1:
auto r0 = scan<int>("42", "{}");
// Value of `r0->value()` is `42`
auto r1 = scan<int, int, int>("42 42 42", "{:d} {:o} {:x}");
// Values of `r1->values()` are `42`, `042`, and `0x42`
auto r2 = scan<int>("1,234", "{:L}");
// Value of `r2->value()` can be `1234` (depending on the locale)
— end example]
Type | Meaning |
---|---|
,
| ; the allowed base prefixes are and .
|
| Copies a value of type from the input. Preceding whitespace is not skipped.
|
| .
|
|
;
the value of is determined by the base prefix:
|
| ; the allowed base prefix is .
|
| The same as , except if the scanned value would be negative, an error with the code is returned.
|
,
| ; the allowed base prefixes are and .
|
none | The same as .
|
The available
presentation types are specified in [tab:scan.type.char].
Type | Meaning |
---|---|
none,
| Copies a value of type from the input. Preceding whitespace is not skipped.
|
, , , , , , ,
| As if by scanning an integer as specified in [tab:scan.type.int].
If the scanned value is negative, an error with the code is returned.
If the scanned value cannot be represented in , an error with the code is returned.
|
| Copies the escaped character ([format.string.escaped]) from the input. Preceding whitespace is not skipped. |
The available
presentation types are specified in [tab:scan.type.bool].
Type | Meaning |
---|---|
| Copies the textual representation, either true or false , from the input.
|
, , , , , , ,
| Copies the integral representation, either or , from the input.
|
none | Copies one of true , false , , or from the input.
|
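As a rough illustration of the default presentation type in the table above, the following sketch accepts either the textual or the integral representation. It assumes that the integral representations are `0` and `1` and ignores locale-specific names; `scan_bool` and `has_prefix` are hypothetical helpers, not part of the proposal.

```cpp
#include <optional>
#include <string_view>

// Returns true if `s` begins with `prefix`.
inline bool has_prefix(std::string_view s, std::string_view prefix) {
    return s.substr(0, prefix.size()) == prefix;
}

// Sketch of scanning a bool with no presentation type: tries "true",
// "false", "1", and "0" in turn (the digits are an assumption of this
// sketch). On success the matched characters are removed from `input`.
inline std::optional<bool> scan_bool(std::string_view& input) {
    struct candidate {
        std::string_view text;
        bool value;
    };
    const candidate candidates[] = {
        {"true", true}, {"false", false}, {"1", true}, {"0", false}};
    for (const candidate& c : candidates) {
        if (has_prefix(input, c.text)) {
            input.remove_prefix(c.text.size());
            return c.value;
        }
    }
    return std::nullopt;  // no accepted representation at the front of the input
}
```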
Values of a floating-point type
are scanned as if by copying characters from the input into a contiguous range
represented by
. Let
sign-value represent the sign of the value.
-
If the first non-whitespace character is +, sign-value is +1.0, and this character is discarded,
-
if the first non-whitespace character is -, sign-value is -1.0, and this character is discarded,
-
otherwise, sign-value is +1.0.
If the characters following the sign are
or
(case insensitive), the scanning is stopped,
and
is scanned.
If the characters following the sign are
or
,
where
is a sequence of alphanumeric characters and underscores (case insensitive), the scanning is stopped,
and
is scanned.
Otherwise, scanning is done as specified by the floating-point presentation type.
If the absolute value of the scanned value is larger than what can be represented by
,
a
with the following code is returned:
-
scan_error::value_positive_overflow if signbit(sign-value) is false,
-
otherwise scan_error::value_negative_overflow.
If the absolute value of the scanned value is between zero and the smallest denormal value of
,
a
with the following code is returned:
-
scan_error::value_positive_underflow if signbit(sign-value) is false,
-
otherwise scan_error::value_negative_underflow.
[Note 5: NaN payload is discarded. Scanning a literal
is not an overflow error. — end note]
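The overflow and underflow rules above can be sketched as a pure classification step. The code below assumes that the magnitude has already been obtained at higher precision as a `long double`; `range_code` and `classify_magnitude` are hypothetical names that mirror the `scan_error` enumerators.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for the four range-related scan_error codes.
enum class range_code {
    in_range,
    value_positive_overflow,
    value_negative_overflow,
    value_positive_underflow,
    value_negative_underflow
};

// `magnitude` is the absolute value that was scanned; `sign_value` is
// +1.0 or -1.0, as in the wording above. F is the destination type.
template <class F>
range_code classify_magnitude(long double magnitude, double sign_value) {
    const bool negative = std::signbit(sign_value);
    // A literal infinity or NaN is not a range error.
    if (std::isinf(magnitude) || std::isnan(magnitude)) return range_code::in_range;
    const long double largest = static_cast<long double>(std::numeric_limits<F>::max());
    const long double smallest = static_cast<long double>(std::numeric_limits<F>::denorm_min());
    if (magnitude > largest)
        return negative ? range_code::value_negative_overflow
                        : range_code::value_positive_overflow;
    if (magnitude > 0.0L && magnitude < smallest)
        return negative ? range_code::value_negative_underflow
                        : range_code::value_positive_underflow;
    return range_code::in_range;
}
```

For example, a magnitude of `1e39` does not fit in `float` and is classified as an overflow, while `1e-50` lies below the smallest denormal `float` and is classified as an underflow.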
The available floating-point presentation types and their meanings are specified in [tab:scan.type.float].
Type | Meaning |
---|---|
,
| followed by ,
except a prefix or is allowed and discarded.
|
,
| followed by .
|
,
| followed by .
|
,
| followed by .
|
none |
|
The available pointer presentation types are specified in [tab:scan.type.ptr].
Type | Meaning |
---|---|
none, ,
|
If is defined, equivalent to scanning a value of type with the scan-type,
followed by a to or ;
otherwise, implementation-defined.
[Note 6: No special null-value, apart from |
7.2.3. Error reporting [scan.err]
Scanning functions report errors using
([expected]).
Exceptions of a type publicly derived from
thrown from the
member function of a user defined specialization of
are caught by the library, and returned from a scanning function as a
with a code of
, and an unspecified message.
Recommended practice: Implementations should capture the message of the thrown exception,
and preserve it in the returned
.
[Note 1:
contains a message of type
,
and exceptions contain a message of type
, so propagating the message in a lifetime- and thread-safe
manner is not possible without using thread-local storage or a side-channel.
Use of TLS is possible because of the validity guarantees of
. — end note]
All other exceptions thrown by iterators and user defined specializations of
are propagated.
Failure to allocate storage is reported by throwing an exception as described in [res.on.exception.handling].
7.2.3.1. Class scan_error
[scan.error]
namespace std {
  class scan_error {
    enum code code_;       // exposition only
    const char* message_;  // exposition only

  public:
    enum code {
      end_of_input,
      invalid_format_string,
      invalid_scanned_value,
      invalid_literal,
      invalid_fill,
      length_too_short,
      value_positive_overflow,
      value_negative_overflow,
      value_positive_underflow,
      value_negative_underflow
    };

    constexpr scan_error() noexcept;
    constexpr scan_error(enum code error_code, const char* message);

    constexpr auto code() const noexcept -> enum code { return code_; }
    constexpr const char* msg() const;
  };
}
The class
defines the type of objects used to represent errors returned from the scanning library.
It stores an error code, and a human-readable descriptive message.
constexpr scan_error ( enum code error_code , const char * message );
Preconditions:
is either a null pointer, or points to a NTCTS ([defns.ntcts]).
Postconditions:
.
constexpr const char * msg () const ;
Preconditions: No other scanning function has been called since the one that returned
.
Returns:
.
7.2.3.2. Class scan_format_string_error
[scan.format.error]
namespace std { class scan_format_string_error : public runtime_error { public : explicit scan_format_string_error ( const string & what_arg ); explicit scan_format_string_error ( const char * what_arg ); }; }
The class
defines the type of objects thrown as exceptions
to report errors in parsing format strings in the scanning library.
scan_format_string_error ( const string & what_arg );
Postconditions:
.
scan_format_string_error ( const char * what_arg );
Postconditions:
.
7.2.4. Result types [scan.result]
template < class Range , class ... Args > constexpr scan - result - type < Range , Args ... > make_scan_result ();
Effects: Equivalent to:
template < class Result , class Range > constexpr void fill_scan_result ( expected < Result , scan_error >& out , expected < Range , scan_error >&& in );
Constraints:
-
Result is a specialization of scan_result, and
-
std::is_same_v<typename Result::range_type, Range> is true.
Effects:
-
If in.has_value() is false, assigns unexpected(std::move(in.error())) to out,
-
if std::is_same_v<typename Result::range_type, ranges::dangling> is false, assigns std::move(*in) to out.range_,
-
otherwise, does nothing.
7.2.4.1. Class template scan_result
[scan.result.result]
namespace std {
  template<class Range, class... Args>
  class scan_result {
    using tuple_type = tuple<Args...>;
    range_type range_;       // exposition only
    tuple<Args...> values_;  // exposition only
    inline constexpr bool is-dangling = is_same_v<Range, ranges::dangling>;  // exposition only

  public:
    using range_type = Range;
    using iterator = see below;
    using sentinel = see below;

    constexpr scan_result();
    constexpr scan_result(const scan_result&) = default;
    constexpr scan_result(scan_result&&) = default;

    constexpr scan_result(Range r, tuple<Args...>&& values);

    template<class OtherR, class... OtherArgs>
      constexpr explicit(see below) scan_result(OtherR&& r, tuple<OtherArgs...>&& values);

    template<class OtherR, class... OtherArgs>
      constexpr explicit(see below) scan_result(const scan_result<OtherR, OtherArgs...>& other);
    template<class OtherR, class... OtherArgs>
      constexpr explicit(see below) scan_result(scan_result<OtherR, OtherArgs...>&& other);

    constexpr scan_result& operator=(const scan_result&) = default;
    constexpr scan_result& operator=(scan_result&&) noexcept(see below) = default;

    template<class OtherR, class... OtherArgs>
      constexpr scan_result& operator=(const scan_result<OtherR, OtherArgs...>& other);
    template<class OtherR, class... OtherArgs>
      constexpr scan_result& operator=(scan_result<OtherR, OtherArgs...>&& other);

    constexpr range_type range() const { return range_; }

    constexpr iterator begin() const;
    constexpr sentinel end() const;

    template<class Self>
      constexpr auto&& values(this Self&&);
    template<class Self>
      constexpr auto&& value(this Self&&);
  };
}
An instance of
holds the scanned values and the remainder of the source range not used for scanning.
If a program declares an explicit or partial specialization of
, the program is ill-formed, no diagnostic required.
shall either be a specialization of
, or
.
shall be true
.
shall be true
.
If
is true
then the destructor of
is trivial.
using iterator = see below ; using sentinel = see below ;
The type
is:
-
If is-dangling is false, ranges::iterator_t<Range>,
-
otherwise, ranges::dangling.
The type
is:
-
If is-dangling is false, ranges::sentinel_t<Range>,
-
otherwise, ranges::dangling.
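The selection of the iterator and sentinel aliases can be sketched with a trait that is specialized for the dangling case. A plain conditional alias would not work, because asking a dangling placeholder for its iterator type is ill-formed. To keep the sketch buildable as C++17, `dangling` below is a hypothetical stand-in for `ranges::dangling`, and `std::begin`/`std::end` stand in for `ranges::iterator_t` and `ranges::sentinel_t`.

```cpp
#include <iterator>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::ranges::dangling.
struct dangling {};

// Primary template: a real range exposes its own iterator and sentinel types.
template <class Range>
struct result_types {
    using iterator = decltype(std::begin(std::declval<Range&>()));
    using sentinel = decltype(std::end(std::declval<Range&>()));
};

// Specialization: there is nothing to iterate over, so both aliases
// collapse to the placeholder type itself.
template <>
struct result_types<dangling> {
    using iterator = dangling;
    using sentinel = dangling;
};

template <class Range>
using result_iterator_t = typename result_types<Range>::iterator;

template <class Range>
using result_sentinel_t = typename result_types<Range>::sentinel;
```

With this trait, `result_iterator_t<std::string_view>` names the string view's iterator type, and `result_iterator_t<dangling>` names `dangling`.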
constexpr scan_result ();
Effects: Value-initializes
and
.
constexpr scan_result ( const scan_result & rhs ) = default ;
Mandates:
-
is_copy_constructible_v<Range> is true, and
-
is_copy_constructible_v<tuple_type> is true.
Effects: Direct-non-list-initializes
with
, and
with
.
constexpr scan_result ( scan_result && rhs ) = default ;
Constraints:
-
is_move_constructible_v<Range> is true, and
-
is_move_constructible_v<tuple_type> is true.
Effects: Direct-non-list-initializes
with
, and
with
.
constexpr scan_result ( Range r , tuple < Args ... >&& values );
Effects: Direct-non-list-initializes
with
, and
with
.
template < class OtherR , class ... OtherArgs > constexpr explicit ( see below ) scan_result ( OtherR && r , tuple < OtherArgs ... > && values );
Constraints:
-
is_constructible_v<Range, OtherR> is true, and
-
is_constructible_v<tuple<Args...>, tuple<OtherArgs...>> is true.
Effects: Direct-non-list-initializes
with
, and
with
.
Remarks: The expression inside
is equivalent to:
.
template < class OtherR , class ... OtherArgs > constexpr explicit ( see below ) scan_result ( const scan_result < OtherR , OtherArgs ... >& other );
Constraints:
-
is_constructible_v<Range, const OtherR&> is true, and
-
is_constructible_v<tuple<Args...>, const tuple<OtherArgs...>&> is true.
Effects: Direct-non-list-initializes
with
, and
with
.
Remarks: The expression inside
is equivalent to:
.
template < class OtherR , class ... OtherArgs > constexpr explicit ( see below ) scan_result ( scan_result < OtherR , OtherArgs ... >&& other );
Constraints:
-
is_constructible_v<Range, OtherR> is true, and
-
is_constructible_v<tuple<Args...>, tuple<OtherArgs...>> is true.
Effects: Direct-non-list-initializes
with
, and
with
.
Remarks: The expression inside
is equivalent to:
.
constexpr scan_result & operator = ( const scan_result & rhs ) = default ;
Effects: Assigns
to
, and
to
.
Returns:
.
Remarks: This operator is defined as deleted unless
is true
.
constexpr scan_result & operator = ( scan_result && rhs ) noexcept ( see below ) = default ;
Constraints:
is true
.
Effects: Assigns
to
, and
to
.
Returns:
.
Remarks: The exception specification is equivalent to
.
template < class OtherR , class ... OtherArgs > constexpr scan_result & operator = ( const scan_result < OtherR , OtherArgs ... >& rhs );
Constraints:
-
is_assignable_v<Range&, const OtherR&> is true, and
-
is_assignable_v<tuple<Args...>&, const tuple<OtherArgs...>&> is true.
Effects: Assigns
to
, and
to
.
Returns:
.
template < class OtherR , class ... OtherArgs > constexpr scan_result & operator = ( scan_result < OtherR , OtherArgs ... >&& rhs );
Constraints:
-
is_assignable_v<Range&, OtherR> is true, and
-
is_assignable_v<tuple<Args...>&, tuple<OtherArgs...>> is true.
Effects: Assigns
to
, and
to
.
Returns:
.
constexpr iterator begin () const ;
Returns:
-
If is-dangling is false, ranges::begin(range_),
-
otherwise, a value-initialized object of type ranges::dangling.
constexpr sentinel end () const ;
Returns:
-
If is-dangling is false, ranges::end(range_),
-
otherwise, a value-initialized object of type ranges::dangling.
template < class Self > constexpr auto && values ( this Self && self );
Returns:
.
template < class Self > constexpr auto && value ( this Self && self );
Constraints:
is
.
Returns:
.
7.2.5. Class template basic_scan_format_string
[scan.fmt.string]
namespace std { template < class charT , class Range , class ... Args > struct basic_scan_format_string { private : basic_string_view < charT > str ; // exposition only public : template < class T > consteval basic_scan_format_string ( const T & s ); basic_scan_format_string ( runtime - format - string < charT > s ) noexcept : str ( s . str ) {} constexpr basic_string_view < charT > get () const noexcept { return str ; } }; }
Constraints:
models
.
Effects: Direct-non-list-initializes
with
.
Remarks: A call to this function is not a core constant expression ([expr.const])
unless there exist
of types
such that
is a format string for
.
7.2.6. Scanning functions [scan.functions]
template < class ... Args , scannable - range < char > Range > scan - result - type < Range , Args ... > scan ( Range && range , scan_format_string < Range , Args ... > fmt );
Effects: Let
be a value-initialized object of type
.
Creates an object
and initializes it with
.
-
If r.has_value() is true, sets result.range_ to *r,
-
otherwise, assigns unexpected(r.error()) to result.
Returns:
.
template < class ... Args , scannable - range < wchar_t > Range > scan - result - type < Range , Args ... > scan ( Range && range , wscan_format_string < Range , Args ... > fmt );
Effects: Let
be a value-initialized object of type
.
Creates an object
and initializes it with
.
-
If r.has_value() is true, sets result.range_ to *r,
-
otherwise, assigns unexpected(r.error()) to result.
Returns:
.
template < class ... Args , scannable - range < char > Range > scan - result - type < Range , Args ... > scan ( const locale & loc , Range && range , scan_format_string < Range , Args ... > fmt );
Effects: Let
be a value-initialized object of type
.
Creates an object
and initializes it with
.
-
If r.has_value() is true, sets result.range_ to *r,
-
otherwise, assigns unexpected(r.error()) to result.
Returns:
.
template < class ... Args , scannable - range < wchar_t > Range > scan - result - type < Range , Args ... > scan ( const locale & loc , Range && range , wscan_format_string < Range , Args ... > fmt );
Effects: Let
be a value-initialized object of type
.
Creates an object
and initializes it with
.
-
If r.has_value() is true, sets result.range_ to *r,
-
otherwise, assigns unexpected(r.error()) to result.
Returns:
.
template<scannable-range<char> Range>
  vscan-result-type<Range> vscan(Range&& range, string_view fmt, scan_args args);
template<scannable-range<wchar_t> Range>
  vscan-result-type<Range> vscan(Range&& range, wstring_view fmt, wscan_args args);
template<scannable-range<char> Range>
  vscan-result-type<Range> vscan(const locale& loc, Range&& range, string_view fmt, scan_args args);
template<scannable-range<wchar_t> Range>
  vscan-result-type<Range> vscan(const locale& loc, Range&& range, wstring_view fmt, wscan_args args);
Effects: Scans
for the character representations of scanning arguments provided by
scanned according to specifications given in
.
If present,
is used for locale-specific formatting.
If successful, returns a
constructed from
and
,
where
is an iterator pointing to the first character that was not scanned in
.
Otherwise, returns a
describing the error.
Throws: As specified in [scan.err].
Remarks: If
is a reference to an array of
,
is treated as a NTCTS ([defns.ntcts]).
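The remark about character arrays matters because a string literal such as `"42"` has type `const char[3]`: used as an ordinary range it would include the terminating null character. A minimal sketch of the intended treatment follows; `as_source` is a hypothetical helper introduced for this illustration and is not part of the proposed interface.

```cpp
#include <cstddef>
#include <string_view>

// Non-array sources are used as they are.
inline std::string_view as_source(std::string_view s) { return s; }

// Array overload: the array is treated as an NTCTS, so the source ends
// at the first null character rather than at the end of the array.
template <std::size_t N>
std::string_view as_source(const char (&arr)[N]) {
    std::size_t len = 0;
    while (len < N && arr[len] != '\0') ++len;
    return std::string_view(arr, len);
}
```

With this treatment `as_source("42")` contains two characters, whereas a `std::string_view` that was explicitly constructed with an embedded null character keeps it.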
7.2.7. Scanner [scan.scanner]
7.2.7.1. Scanner requirements [scan.scanner.requirements]
A type
meets the Scanner requirements if it meets the
-
Cpp17DefaultConstructible,
-
Cpp17CopyConstructible,
-
Cpp17CopyAssignable,
-
Cpp17Swappable, and
-
Cpp17Destructible,
requirements, and the expressions shown in [tab:scan.scanner] are valid and have the indicated semantics.
Given character type
, source range type
, and scanning argument type
,
in [tab:scan.scanner]:
-
s is a value of type (possibly const) S,
-
ls is an lvalue of type S,
-
t is an lvalue of type T,
-
PC is basic_scan_parse_context<charT>,
-
SC is basic_scan_context<Range, charT>,
-
pc is an lvalue of type PC, and
-
sc is an lvalue of type SC.
points to the beginning of the scan-format-spec ([scan.string]) of the replacement field
being scanned in the format string. If scan-format-spec is not present or empty then either
or
.
Expression | Return type | Requirement |
---|---|---|
|
| Parses scan-format-spec ([scan.string]) for type in the range until the first unmatched character. Throws unless the whole range is parsed or the unmatched character is .
Stores the parsed format specifiers in and returns an iterator past the end of the parsed range.
|
|
| Scans from according to the specifiers stored in .
Reads the input from or , and writes the result in .
On success, returns an iterator past the end of the last scanned character from ,
otherwise returns an object of type .
The value of after calling shall only depend on , ,
and the range from the last call to .
|
7.2.7.2. Concept scannable
[scan.scannable]
namespace std { template < class T , class Context , class Scanner = typename Context :: template scanner_type < T >> concept scannable - with = // exposition only semiregular < Scanner > && requires ( Scanner & s , const Scanner & cs , T & t , Context & ctx , basic_scan_parse_context < typename Context :: char_type >& pctx ) { { s . parse ( pctx ) } -> same_as < typename decltype ( pctx ) :: iterator > ; { cs . scan ( t , ctx ) } -> same_as < expected < typename Context :: iterator , scan_error >> ; }; template < class T , class charT > concept scannable = scannable - with < T , basic_scan_context < unspecified , charT >> ; }
A type
and a character type
model
if
meets the Scanner requirements ([scan.scanner.requirements]).
[Note 1:
is true
, even though a
can only be
scanned from a contiguous borrowed range. — end note]
7.2.7.3. Scanner specializations
The functions defined in [scan.functions] use specializations of the class template
to scan individual arguments.
Let
be either
or
.
Each specialization of
is either enabled or disabled, as described below.
A debug-enabled specialization of
additionally provides a public, constexpr,
non-static member function
which modifies the state of the
to be
as if the type of the std-scan-format-spec parsed by the last call to
were
.
Each header that declares the template
provides the following enabled specializations:
-
The debug-enabled specializations
template <> struct scanner < char , char > ; template <> struct scanner < wchar_t , wchar_t > ;
-
For each charT, the debug-enabled string type specializations
template < class Allocator > struct scanner < basic_string < charT , char_traits < charT > , Allocator > , charT > ; template <> struct scanner < basic_string_view < charT > , charT > ;
-
For each charT, for each arithmetic type ArithmeticT other than char, wchar_t, char8_t, char16_t, or char32_t, a specialization
template <> struct scanner < ArithmeticT , charT > ;
-
For each charT, the pointer type specializations
template <> struct scanner < void * , charT > ; template <> struct scanner < const void * , charT > ;
The
member functions of these scanners interpret the format specification as a std-scan-format-spec as described in [scan.string.std].
For any types
and
for which neither the library nor the user provides an explicit or partial specialization
of the class template
,
is disabled.
If the library provides an explicit or partial specialization of
,
that specialization is enabled and meets the Scanner requirements except as noted otherwise.
If
is a disabled specialization of
, these values are false
:
-
is_default_constructible_v<S>,
-
is_copy_constructible_v<S>,
-
is_move_constructible_v<S>,
-
is_copy_assignable_v<S>, and
-
is_move_assignable_v<S>.
An enabled specialization of
meets the Scanner requirements ([scan.scanner.requirements]).
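The distinction between enabled and disabled specializations can be observed with type traits. In the sketch below, `scanner` is a hypothetical stand-in for the proposed class template whose primary template is disabled by deleting its special member functions; `user_type` and `has_enabled_scanner` are likewise names invented for this illustration.

```cpp
#include <type_traits>

// Hypothetical stand-in for the primary template: disabled by default.
template <class T, class CharT = char>
struct scanner {
    scanner() = delete;
    scanner(const scanner&) = delete;
    scanner& operator=(const scanner&) = delete;
};

// An enabled specialization; parse() and scan() are omitted in this sketch.
template <>
struct scanner<int, char> {};

// A type for which nobody provides a specialization.
struct user_type {};

// A specialization is considered usable here if it can be default-constructed
// and copied, which is exactly what a disabled specialization cannot do.
template <class T, class CharT = char>
inline constexpr bool has_enabled_scanner =
    std::is_default_constructible_v<scanner<T, CharT>> &&
    std::is_copy_constructible_v<scanner<T, CharT>> &&
    std::is_copy_assignable_v<scanner<T, CharT>>;
```

Because the copy operations are user-declared as deleted, no move operations are implicitly declared, so the move-related traits are false as well.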
7.2.7.4. Class template basic_scan_parse_context
[scan.parse.ctx]
namespace std {
  template<class charT>
  class basic_scan_parse_context {
  public:
    using char_type = charT;
    using const_iterator = typename basic_string_view<charT>::const_iterator;
    using iterator = const_iterator;

  private:
    iterator begin_;                               // exposition only
    iterator end_;                                 // exposition only
    enum indexing { unknown, manual, automatic };  // exposition only
    indexing indexing_;                            // exposition only
    size_t next_arg_id_;                           // exposition only
    size_t num_args_;                              // exposition only

  public:
    constexpr explicit basic_scan_parse_context(basic_string_view<charT> fmt) noexcept;
    basic_scan_parse_context(const basic_scan_parse_context&) = delete;
    basic_scan_parse_context& operator=(const basic_scan_parse_context&) = delete;

    constexpr const_iterator begin() const noexcept { return begin_; }
    constexpr const_iterator end() const noexcept { return end_; }
    constexpr void advance_to(const_iterator it);

    constexpr size_t next_arg_id();
    constexpr void check_arg_id(size_t id);
  };
}
An instance of
holds the format string parsing state,
consisting of the format string range being parsed and the argument counter for automatic indexing.
If a program declares an explicit or partial specialization of
,
the program is ill-formed, no diagnostic required.
constexpr explicit basic_scan_parse_context ( basic_string_view < charT > fmt ) noexcept ;
Effects: Initializes
with
,
with
,
with
,
with
, and
with
.
[Note 1: Any call to
or
on an instance of
initialized using this constructor is not a core constant expression. — end note]
constexpr void advance_to ( const_iterator it );
Preconditions:
is reachable from
.
Effects: Equivalent to:
.
constexpr size_t next_arg_id ();
Effects: If
is true
, equivalent to:
if ( indexing_ == unknown ) indexing_ = automatic ; return next_arg_id_ ++ ;
Otherwise, the string is not a format string for
.
Remarks: Let
be the value of
prior to this call.
Call expressions where
is false
are not core constant expressions ([expr.const]).
constexpr void check_arg_id ( size_t id );
Effects: If
is true
, equivalent to:
if ( indexing_ == unknown ) indexing_ = manual ;
Otherwise, the string is not a format string for
.
Remarks: A call to this function is a core constant expression ([expr.const]) only if
is true
.
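The interplay between `next_arg_id` and `check_arg_id` amounts to a small state machine: once a format string uses automatic indexing it cannot switch to manual indexing, and vice versa. The sketch below assumes the same conditions as `std::basic_format_parse_context` and reports violations by throwing `std::runtime_error`; `arg_indexer` is a hypothetical class, not the proposed one.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical, run-time-only model of the argument indexing state.
class arg_indexer {
    enum class indexing { unknown, manual, automatic };
    indexing indexing_ = indexing::unknown;
    std::size_t next_arg_id_ = 0;
    std::size_t num_args_;

public:
    explicit arg_indexer(std::size_t num_args) : num_args_(num_args) {}

    // Used for replacement fields without an explicit index, such as "{}".
    std::size_t next_arg_id() {
        if (indexing_ == indexing::manual)
            throw std::runtime_error("cannot mix manual and automatic indexing");
        indexing_ = indexing::automatic;
        if (next_arg_id_ >= num_args_)
            throw std::runtime_error("argument index out of range");
        return next_arg_id_++;
    }

    // Used for replacement fields with an explicit index, such as "{0}".
    void check_arg_id(std::size_t id) {
        if (indexing_ == indexing::automatic)
            throw std::runtime_error("cannot mix manual and automatic indexing");
        indexing_ = indexing::manual;
        if (id >= num_args_)
            throw std::runtime_error("argument index out of range");
    }
};
```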
7.2.8. Class template basic_scan_context
[scan.context]
namespace std {
  template<class Range, class charT>
  class basic_scan_context {
    iterator current_;                          // exposition only
    sentinel end_;                              // exposition only
    basic_scan_args<basic_scan_context> args_;  // exposition only

  public:
    using char_type = charT;
    using range_type = Range;
    using iterator = ranges::iterator_t<range_type>;
    using sentinel = ranges::sentinel_t<range_type>;
    template<class T> using scanner_type = scanner<T, char_type>;

    basic_scan_arg<basic_scan_context> arg(size_t id) const noexcept;
    std::locale locale();

    iterator begin() const { return current_; }
    sentinel end() const { return end_; }
    ranges::subrange<iterator, sentinel> range() const;
    void advance_to(iterator it);
  };
}
An instance of
holds scanning state consisting of the scanning arguments
and the source range.
If a program declares an explicit or partial specialization of
,
the program is ill-formed, no diagnostic required.
shall model
, and its value type shall be
.
The iterator and sentinel types of
shall model
.
is an alias for a specialization of
with a range type
that can contain a reference to any other forward range with a value type of
.
Similarly,
is an alias for a specialization of
with a range type that can contain a reference to any other forward range with a value type of
.
Recommended practice: For a given type charT,
implementations should provide a single instantiation for reading from
,
, or any other container with contiguous storage
by wrapping those in temporary objects with a uniform interface, such as a
.
basic_scan_arg < basic_scan_context > arg ( size_t id ) const noexcept ;
Returns:
.
std :: locale locale ();
Returns: The locale passed to the scanning function if the latter takes one, and
otherwise.
ranges :: subrange < iterator , sentinel > range () const ;
Effects: Equivalent to:
void advance_to ( iterator it );
Effects: Equivalent to:
7.2.9. Arguments [scan.arguments]
7.2.9.1. Class template basic_scan_arg
[scan.arg]
namespace std { template < class Context > class basic_scan_arg { public : class handle ; private : using char - type = typename Context :: char_type ; // exposition only variant < monostate , signed char * , short * , int * , long * , long long * , unsigned char * , unsigned short * , unsigned int * , unsigned long * , unsigned long long * , bool * , char - type * , void ** , const void ** , float * , double * , long double * , basic_string < char - type >* , basic_string_view < char - type >* , handle > value ; // exposition only template < class T > explicit basic_scan_arg ( T & v ) noexcept ; // exposition only public : basic_scan_arg () noexcept ; explicit operator bool () const noexcept ; template < class Visitor > decltype ( auto ) visit ( this basic_scan_arg arg , Visitor && vis ); template < class R , class Visitor > R visit ( this basic_scan_arg arg , Visitor && vis ); }; }
An instance of
provides access to a scanning argument for user-defined scanners.
The behavior of a program that adds specializations of
is undefined.
basic_scan_arg () noexcept ;
Postconditions:
.
template < class T > explicit basic_scan_arg ( T & v ) noexcept ;
Constraints:
satisfies
.
Effects: Let
be
.
-
If TD is a standard signed integer type ([basic.fundamental]), a standard unsigned integer type, bool, char-type, void*, a standard floating-point type, basic_string<char-type>, or basic_string_view<char-type>, initializes value with addressof(v);
-
otherwise, initializes value with handle(v).
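The idea of storing a pointer to the destination object in a variant, and later visiting it to write the scanned value, can be sketched with a much reduced set of alternatives. `mini_arg`, `make_arg`, and `store_text` are hypothetical names; the conversions use `std::stoi` and `std::stod` purely for brevity and are not what the proposal specifies.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical, reduced analogue of the exposition-only variant: it holds
// a pointer to the destination, never a copy of it.
using mini_arg = std::variant<std::monostate, int*, double*, std::string*>;

// Builds an argument that refers to `v`.
template <class T>
mini_arg make_arg(T& v) {
    return mini_arg(&v);
}

// Visits the argument and writes a value converted from `text` into the
// object it points to. Returns false for an empty argument.
inline bool store_text(mini_arg arg, const std::string& text) {
    return std::visit(
        [&](auto p) -> bool {
            using P = decltype(p);
            if constexpr (std::is_same_v<P, std::monostate>) {
                return false;  // no destination to write to
            } else if constexpr (std::is_same_v<P, int*>) {
                *p = std::stoi(text);
                return true;
            } else if constexpr (std::is_same_v<P, double*>) {
                *p = std::stod(text);
                return true;
            } else {
                *p = text;  // std::string*
                return true;
            }
        },
        arg);
}
```

Since only a pointer is stored, copying the argument object is cheap and every copy refers to the same destination.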
explicit operator bool () const noexcept ;
Returns:
.
template < class Visitor > decltype ( auto ) visit ( this basic_scan_arg arg , Visitor && vis );
Effects: Equivalent to:
template < class R , class Visitor > R visit ( this basic_scan_arg arg , Visitor && vis );
Effects: Equivalent to:
The class
allows scanning an object of a user-defined type.
namespace std {
  template<class Context>
  class basic_scan_arg<Context>::handle {
    void* ptr_;  // exposition only
    expected<void, scan_error> (*scan_)(basic_scan_parse_context<char_type>&,
                                        Context&, void*);  // exposition only

    template<class T> explicit handle(T& val) noexcept;  // exposition only

    friend class basic_scan_arg<Context>;  // exposition only

  public:
    expected<void, scan_error> scan(basic_scan_parse_context<char_type>& parse_ctx,
                                    Context& ctx) const;
  };
}
template < class T > explicit handle ( T & val ) noexcept ;
Mandates:
satisfies
.
Effects: Initializes
with
and
with
[](basic_scan_parse_context<char_type>& parse_ctx, Context& scan_ctx, void* ptr)
    -> expected<void, scan_error> {
  typename Context::template scanner_type<T> s;
  auto p = do-parse(s, parse_ctx);
  if (!p)
    return unexpected(p.error());
  parse_ctx.advance_to(*p);
  auto r = s.scan(*static_cast<T*>(ptr), scan_ctx);
  if (!r)
    return unexpected(r.error());
  scan_ctx.advance_to(*r);
  return {};
}
where do-parse:
-
has a return type of expected<basic_scan_parse_context<char_type>::iterator, scan_error>,
-
calls s.parse(pc),
-
catches exceptions derived from scan_format_string_error thrown by s.parse. If such an exception is caught, returns a scan_error with a code of invalid_format_string.
-
Otherwise, returns the iterator returned by s.parse.
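The type erasure performed by the handle, together with the conversion of a parse exception into an error code, can be sketched as follows. Everything here is a simplified assumption: `format_error_stub` stands in for `scan_format_string_error`, `scan_code` for `scan_error`, string views replace the parse and scan contexts, and `point`, `point_scanner`, and `erased_handle` are hypothetical names.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical stand-ins for the proposed error types.
struct format_error_stub : std::runtime_error {
    using std::runtime_error::runtime_error;
};
enum class scan_code { ok, invalid_format_string, invalid_scanned_value };

struct point { int x = 0; int y = 0; };

// A user-provided scanner: parse() throws on any format specifier,
// scan() reads input of the form "x,y".
struct point_scanner {
    void parse(std::string_view spec) {
        if (!spec.empty()) throw format_error_stub("point takes no specifiers");
    }
    scan_code scan(point& p, std::string_view input) {
        const std::size_t comma = input.find(',');
        if (comma == std::string_view::npos) return scan_code::invalid_scanned_value;
        try {
            p.x = std::stoi(std::string(input.substr(0, comma)));
            p.y = std::stoi(std::string(input.substr(comma + 1)));
        } catch (const std::exception&) {
            return scan_code::invalid_scanned_value;
        }
        return scan_code::ok;
    }
};

// Type-erased handle: an untyped pointer plus a function that restores the type.
class erased_handle {
    using fn = scan_code (*)(std::string_view, std::string_view, void*);
    void* ptr_;
    fn scan_;
    erased_handle(void* p, fn f) : ptr_(p), scan_(f) {}

public:
    template <class Scanner, class T>
    static erased_handle make(T& val) {
        return erased_handle(
            &val, [](std::string_view spec, std::string_view input, void* ptr) -> scan_code {
                Scanner s;
                try {
                    s.parse(spec);  // the role played by do-parse
                } catch (const format_error_stub&) {
                    return scan_code::invalid_format_string;
                }
                return s.scan(*static_cast<T*>(ptr), input);
            });
    }
    scan_code scan(std::string_view spec, std::string_view input) const {
        return scan_(spec, input, ptr_);
    }
};
```

The captureless lambda converts to a plain function pointer, so the handle stays trivially copyable even though it can scan an arbitrary user-defined type.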
expected < void , scan_error > scan ( basic_scan_parse_context < char_type >& parse_ctx , Context & scan_ctx ) const ;
Effects: Equivalent to:
7.2.9.2. Class template scan - arg - store
[scan.arg.store]
namespace std { template < class Context , class ... Args > class scan - arg - store { // exposition only array < basic_scan_arg < Context > , sizeof ...( Args ) > args ; // exposition only }; }
An instance of
stores scanning arguments.
template < class Context = scan_context , class ... Args > constexpr scan - arg - store < Context , Args ... > make_scan_args ( std :: tuple < Args ... > & values );
Preconditions: The type
meets the Scanner requirements ([scan.scanner.requirements]) for each
in
.
Returns: An object of type
.
All elements of the
member of the returned object are initialized with
,
where
is an index in the range of
.
template < class ... Args > constexpr scan - arg - store < wscan_context , Args ... > make_wscan_args ( std :: tuple < Args ... > & values );
Effects: Equivalent to:
.
7.2.9.3. Class template basic_scan_args
[scan.args]
namespace std { template < class Context > class basic_scan_args { size_t size_ ; // exposition only const basic_scan_arg < Context >* data_ ; // exposition only public : basic_scan_args () noexcept ; template < class ... Args > basic_scan_args ( const scan - arg - store < Context , Args ... >& store ) noexcept ; basic_scan_arg < Context > get ( size_t i ) noexcept ; }; template < class Context , class ... Args > basic_scan_args ( scan - arg - store < Context , Args ... > ) -> basic_scan_args < Context > ; }
An instance of
provides access to scanning arguments.
Implementations should optimize the representation of
for a small number of scanning arguments.
[Note 1: For example, by storing indices of type alternatives separately from values and packing the former. — end note]
template < class ... Args > basic_scan_args ( const scan - arg - store < Context , Args ... >& store ) noexcept ;
Effects: Initializes
with
and
with
;
basic_scan_arg < Context > get ( size_t i ) noexcept ;
Returns:
.