1. Introduction
This paper proposes making std::error_code formattable using the formatting
facility introduced in C++20 (std::format).
2. Motivation
std::error_code has a rudimentary ostream inserter. For example:
std::error_code ec;
auto size = std::filesystem::file_size("nonexistent", ec);
std::cout << ec;
This works and prints "generic:2".
However, the following code doesn’t compile:
std::print("{}\n", ec);
Unfortunately, the existing inserter has several issues, such as I/O manipulators applying only to the category name rather than the entire error code, resulting in confusing output:
std::cout << std::left << std::setw(12) << ec;
This prints:
generic     :2
Additionally, it doesn’t allow formatting the error message and introduces potential encoding issues, as the encoding of the category name is unspecified.
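The width problem can be reproduced without touching std::cout. The following is a minimal sketch, not part of the proposal: direct_insert and padded_insert are hypothetical helper names, and the error code is constructed by hand with value 2 in the generic category so that the result does not depend on the file system.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <system_error>

// Streams `ec` directly. The width set by std::setw applies only to the
// first inserted item (the category name); the stream then resets it to 0.
std::string direct_insert(const std::error_code& ec, int width) {
  std::ostringstream os;
  os << std::left << std::setw(width) << ec;
  return os.str();
}

// Workaround available today: render the error code into a string first,
// then pad that string as a single unit.
std::string padded_insert(const std::error_code& ec, int width) {
  std::ostringstream text;
  text << ec;
  std::ostringstream os;
  os << std::left << std::setw(width) << text.str();
  return os.str();
}
```

With std::error_code ec(2, std::generic_category()), direct_insert(ec, 12) yields "generic     :2", while padded_insert(ec, 12) yields "generic:2" followed by three spaces. The workaround needs an intermediate string for every call, which is one reason a formatter is preferable.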
3. Proposal
This paper proposes adding a formatter specialization for std::error_code
to address the problems discussed in the previous section.
The default format will produce the same output as the ostream inserter:
std::print("{}\n", ec);
Output:
generic:2
It will correctly handle width and alignment:
std::print("[{:>12}]\n", ec);
Output:
[   generic:2]
Additionally, it will allow formatting the error message:
std::print("{:s}\n", ec);
Output:
No such file or directory
(The actual message depends on the platform.)
The main challenge lies in the standard’s lack of specification for the
encodings of strings returned by error_category::name and
error_code::message / error_category::message ([syserr.errcat.virtuals]):
virtual const char* name() const noexcept = 0;
Returns: A string naming the error category.
virtual string message(int ev) const = 0;
Returns: A string that describes the error condition denoted by ev.
In practice, implementations typically define category names as string literals, meaning they are in the ordinary literal encoding.
However, there is significant divergence in message encodings. libc++ and
libstdc++ use strerror for the generic category, which is in the encoding of
the current C locale (not the "C" locale), but disagree on the encoding for the
system category: libstdc++ uses the Active Code Page (ACP) while libc++ again
uses strerror / C locale on Windows. Microsoft STL uses a table of string
literals in the ordinary literal encoding for the generic category and ACP for
the system category.
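The following table summarizes the differences (where a cell has two entries,
they apply to the generic / system category respectively):
| | libstdc++ | libc++ | Microsoft STL |
| POSIX | strerror | strerror | N/A |
| Windows | strerror / ACP | strerror | ordinary literals / ACP |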
None of this is usable in a portable way through the generic error_category
API because the encodings can be, and often are, different.
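To make the portability problem concrete, here is a small sketch of a user-defined category. Everything in it is hypothetical: latin1_category returns message bytes in ISO 8859-1, and has_non_ascii is a helper written for this illustration. The point is that a caller receives only a std::string and has no way to ask which encoding it is in.

```cpp
#include <string>
#include <system_error>

// Hypothetical category whose messages are ISO 8859-1 (Latin-1) bytes.
class latin1_category : public std::error_category {
 public:
  const char* name() const noexcept override { return "latin1-demo"; }
  std::string message(int) const override {
    // "cafe ferme" with acute accents, encoded in Latin-1.
    // The byte 0xE9 on its own is not valid UTF-8.
    return "caf\xE9 ferm\xE9";
  }
};

// Returns true if `s` contains a byte outside the ASCII range, that is, a
// byte whose meaning depends on an encoding the API does not report.
bool has_non_ascii(const std::string& s) {
  for (unsigned char c : s)
    if (c > 0x7F) return true;
  return false;
}
```

Given an error code built from this category, ec.message() returns bytes that a consumer expecting UTF-8 would render incorrectly, and nothing in error_category lets generic code detect this.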
To address this, the proposal suggests using the C locale encoding (execution
character set), which is already employed in most cases and aligns with
underlying system APIs. Microsoft STL’s implementation has a number of known
bugs ([MSSTL-3254], [MSSTL-4711]) and will
likely need to change anyway. This also resolves [LWG4156].
An alternative approach could involve communicating the encoding from
error_category. However, this introduces ABI challenges and complicates usage
compared to adopting a single encoding.
4. Wording
Add to "Header <system_error> synopsis" [system.error.syn]:
// [system.error.fmt], formatter
template<class charT> struct formatter<error_code, charT>;
Add a new section "Formatting" [system.error.fmt] under "Class
error_code" [syserr.errcode]:
template<class charT> struct formatter<error_code, charT> {
  constexpr typename basic_format_parse_context<charT>::iterator
    parse(basic_format_parse_context<charT>& ctx);

  template<class FormatContext>
    typename FormatContext::iterator
      format(const error_code& ec, FormatContext& ctx) const;
};
constexpr typename basic_format_parse_context<charT>::iterator
  parse(basic_format_parse_context<charT>& ctx);
Effects: Parses the format specifier as an error-code-format-spec and stores the
parsed specifiers in *this.
error-code-format-spec:
fill-and-align(opt) width(opt) s(opt)
where the productions fill-and-align and width are described in [format.string].
Returns: An iterator past the end of the error-code-format-spec.
template<class FormatContext>
  typename FormatContext::iterator
    format(const error_code& ec, FormatContext& ctx) const;
Effects: Let msg be ec.message() transcoded into the ordinary literal
encoding, with maximal subparts of ill-formed subsequences substituted with
U+FFFD REPLACEMENT CHARACTER per the Unicode Standard, Chapter 3.9 U+FFFD
Substitution in Conversion, if the s option is used, otherwise
format("{}:{}", ec.category().name(), ec.value()).
Writes msg into ctx.out(), adjusted according to the error-code-format-spec.
Returns: An iterator past the end of the output range.
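The substitution rule referenced in the Effects clause can be sketched as follows. This is an illustration only, under a simplifying assumption: both the source encoding and the ordinary literal encoding are UTF-8, so transcoding reduces to sanitizing the input. The function name replace_ill_formed_utf8 is hypothetical. The byte ranges follow the table of well-formed UTF-8 byte sequences in the Unicode Standard, and each maximal subpart of an ill-formed subsequence becomes one U+FFFD (bytes EF BF BD).

```cpp
#include <cstddef>
#include <string>

std::string replace_ill_formed_utf8(const std::string& in) {
  static const char replacement[] = "\xEF\xBF\xBD";  // U+FFFD in UTF-8
  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    unsigned char lead = static_cast<unsigned char>(in[i]);
    std::size_t len = 0;                 // expected length; 0 = invalid lead
    unsigned char lo = 0x80, hi = 0xBF;  // allowed range for the second byte
    if (lead <= 0x7F) {
      len = 1;
    } else if (lead >= 0xC2 && lead <= 0xDF) {
      len = 2;
    } else if (lead >= 0xE0 && lead <= 0xEF) {
      len = 3;
      if (lead == 0xE0) lo = 0xA0;  // excludes overlong encodings
      if (lead == 0xED) hi = 0x9F;  // excludes surrogates
    } else if (lead >= 0xF0 && lead <= 0xF4) {
      len = 4;
      if (lead == 0xF0) lo = 0x90;  // excludes overlong encodings
      if (lead == 0xF4) hi = 0x8F;  // excludes values above U+10FFFF
    }
    if (len == 0) {  // a byte that can never start a sequence
      out += replacement;
      ++i;
      continue;
    }
    // Count how many bytes form a valid prefix of a well-formed sequence.
    std::size_t k = 1;
    for (; k < len && i + k < in.size(); ++k) {
      unsigned char b = static_cast<unsigned char>(in[i + k]);
      unsigned char min_b = (k == 1) ? lo : 0x80;
      unsigned char max_b = (k == 1) ? hi : 0xBF;
      if (b < min_b || b > max_b) break;
    }
    if (k == len)
      out.append(in, i, len);  // complete, well-formed sequence
    else
      out += replacement;      // the k-byte maximal subpart becomes one U+FFFD
    i += k;
  }
  return out;
}
```

Well-formed input passes through unchanged, while a truncated sequence such as the two bytes E2 82 is replaced by a single U+FFFD. A real implementation would additionally have to convert between the C locale encoding and the ordinary literal encoding when they differ.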
Modify [syserr.errcat.virtuals]:
...
virtual string message(int ev) const = 0;
Returns: A string of multibyte characters in the execution character set
that describes the error condition denoted by ev.
5. Implementation
The proposed formatter for std::error_code has been implemented in the
open-source {fmt} library ([FMT]).