1. Introduction
To prevent mojibake, std::print may use a native Unicode API when writing to
a terminal, bypassing the stream buffer. During the review of [P2093] "Formatted output", Tim Song suggested that synchronizing
std::print with the
underlying stream may be beneficial for gradual adoption. This paper presents
motivating examples, observes that this problem doesn’t normally happen in
practice, and proposes a minor update to the wording to provide a synchronization
guarantee.
2. Revision history
Changes since R2:
- Replaced "If the native Unicode API is used and stdout referring to a terminal is buffered by default on the current system, the function flushes the stream's buffer before writing out." with "If the native Unicode API is used, the function flushes the stream's buffer before writing out." to provide a stronger guarantee per LEWG feedback.
- Replaced "If stdout referring to a terminal is buffered by default on the current system, the function flushes the os's buffer before writing out." with "If the native Unicode API is used, the function flushes the os's buffer before writing out." to provide a stronger guarantee per LEWG feedback.
Changes since R1:
- Added LEWG poll results.
- Replaced "the terminal output is buffered by default" with a more specific "stdout referring to a terminal is buffered by default on the current system".
Changes since R0:
- Added another motivating example.
- Split Discussion into multiple sections.
- Added the wording.
3. LEWG Poll (R1)
Poll: Send P2539R1 Should The Output Of print To A Terminal Be Synchronized With The Underlying Stream? to Library Working Group for C++23, classified as an addition (P0592R4 bucket 3 item).
SF | F | N | A | SA |
---|---|---|---|---|
11 | 10 | 2 | 1 | 0 |
Outcome: Consensus in favor
4. Motivating examples
Consider the following example:
    printf("first\n");
    std::print("second\n");

This will produce the expected output:

    first
    second

because stdout is at least line buffered by default.
However, in theory this may reorder the output:
    printf("first");
    std::print("second");

because of buffering in printf but not std::print. Testing on Windows 10
with MSVC 19.28 and {fmt}'s implementation of print ([FMT]) showed that the
order is preserved in this case as well. This suggests that stdout is
completely unbuffered by default on this system. This is also confirmed in [MS-CRT]:

"The stdout and stderr functions are flushed whenever they are full or, if you are writing to a character device, after each library call."
On other systems the order is preserved too because the output goes through the stream buffer in both cases.
Consider another example that involves iostreams:

    #include <iostream>

    struct A {
      int a;
      int b;

      friend std::ostream& operator<<(std::ostream& os, const A& a) {
        std::print(os, "{{a={}, b={}}}", a.a, a.b);
        return os;
      }
    };

    int main() {
      A a = {2, 4};
      std::cout << "A is " << a << '\n';
    }

We updated the implementation of print for ostream in {fmt} to use the
native Unicode API and verified that there is no reordering in this example
either on the same test platform.
5. Proposal
Although the issue appears to be mostly theoretical, it might still be
beneficial to clarify in the standard that synchronization is desired.
It is possible to guarantee the desired output ordering by flushing the buffer
before writing to a terminal in std::print.
This will incur additional cost but only for the terminal case and when
transcoding is needed. Platforms that don’t buffer the output like the one we
tested should be able to avoid a call to flush.
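The effect of flushing before a write that bypasses the buffer can be illustrated with a toy model. This is only a sketch: model_stream, write_buffered, write_native and run_model are hypothetical names invented for the illustration, and plain strings stand in for the stdio buffer and the terminal.

```cpp
#include <string>
#include <string_view>

// Toy model, not a real stream: `terminal` is what the user would see,
// `buffer` plays the role of the stdio stream buffer.
struct model_stream {
  std::string terminal;
  std::string buffer;

  // Models printf-style output that goes through the stream buffer.
  void write_buffered(std::string_view s) { buffer += s; }

  // Models fflush: moves buffered text to the terminal.
  void flush() {
    terminal += buffer;
    buffer.clear();
  }

  // Models a native Unicode API that writes straight to the terminal.
  void write_native(std::string_view s, bool flush_first) {
    if (flush_first) flush();  // the synchronization discussed above
    terminal += s;
  }
};

// Returns the text that ends up on the modeled terminal.
std::string run_model(bool flush_first) {
  model_stream s;
  s.write_buffered("first");
  s.write_native("second", flush_first);
  s.flush();  // models the flush at program exit
  return s.terminal;
}
```

In this model, run_model(false) yields "secondfirst" and run_model(true) yields "firstsecond", which is the ordering a flush before the native write is meant to guarantee.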
Neither {fmt} ([FMT]) nor Rust ([RUST-STDIO]) makes any attempt to provide such
synchronization in its implementation of print. However, in practice this
synchronization appears to be a no-op on tested platforms.
6. Wording
Modify subsection "Print functions [print.fun]":
    void vprint_unicode(FILE* stream, string_view fmt, format_args args);
...
Effects: The function initializes an automatic variable via
    string out = vformat(fmt, args);

If stream refers to a terminal capable of displaying Unicode, writes out to
the terminal using the native Unicode API; if out contains invalid code units,
the behavior is undefined and implementations are encouraged to diagnose it.
Otherwise writes out to stream unchanged.
If the native Unicode API is used, the function flushes the stream's buffer
before writing out.
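As a non-normative illustration, the flush-then-write order for a FILE* could be sketched as follows. The function name write_synchronized is hypothetical, and this is neither the {fmt} implementation nor a standard library one. On Windows the sketch assumes the Win32 calls GetConsoleMode, MultiByteToWideChar and WriteConsoleW plus the CRT helpers _fileno and _get_osfhandle. Elsewhere it simply writes through the stream buffer.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

#ifdef _WIN32
#  include <io.h>
#  include <windows.h>
#endif

// Hypothetical helper: writes already formatted UTF-8 text to `stream`
// and returns true on success.
bool write_synchronized(std::FILE* stream, std::string_view out) {
#ifdef _WIN32
  HANDLE handle = reinterpret_cast<HANDLE>(_get_osfhandle(_fileno(stream)));
  DWORD mode = 0;
  if (handle != INVALID_HANDLE_VALUE && GetConsoleMode(handle, &mode)) {
    // The stream refers to a console: flush pending stdio output first so
    // that earlier printf calls appear before this text.
    if (std::fflush(stream) != 0) return false;
    if (out.empty()) return true;
    int size = static_cast<int>(out.size());
    int n = MultiByteToWideChar(CP_UTF8, 0, out.data(), size, nullptr, 0);
    if (n <= 0) return false;
    std::wstring wide(static_cast<std::size_t>(n), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, out.data(), size, wide.data(), n);
    DWORD written = 0;
    return WriteConsoleW(handle, wide.data(), static_cast<DWORD>(n),
                         &written, nullptr) != 0;
  }
#endif
  // Not a console (or not Windows): write through the stream buffer unchanged.
  return std::fwrite(out.data(), 1, out.size(), stream) == out.size();
}
```

When the stream is not a console, no flush is needed because both writes go through the same buffer, which matches the observation that the extra cost is confined to the terminal case.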
Modify subsection "Print [ostream.formatted.print]":
    void vprint_unicode(ostream& os, string_view fmt, format_args args);
    void vprint_nonunicode(ostream& os, string_view fmt, format_args args);
Effects: Behaves as a formatted output function
([ostream.formatted.reqmts])
of os, except that:
- failure to generate output is reported as specified below, and
- any exception thrown by the call to vformat is propagated without regard to the value of os.exceptions() and without turning on ios_base::badbit in the error state of os.
After constructing a sentry object, the
function initializes an automatic variable via

    string out = vformat(os.getloc(), fmt, args);

If the function is vprint_unicode and os is a stream that refers to a
terminal capable of displaying Unicode which is determined in an
implementation-defined manner, writes out to the terminal using the
native Unicode API; if out contains invalid code units, the behavior
is undefined and implementations are encouraged to diagnose it.
If the native Unicode API is used, the function flushes the os's buffer before
writing out.
Otherwise (if os is not such a stream or the function is
vprint_nonunicode), inserts the character sequence [out.begin(), out.end())
into os.
If writing to the terminal or inserting into os fails, calls
os.setstate(ios_base::badbit) (which may throw ios_base::failure).
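The ordering required for the ostream overload can be illustrated with a small, non-normative sketch. terminal_buf and print_to_terminal are hypothetical names, a std::string stands in for the terminal, and write_native stands in for a native Unicode API that bypasses the stream buffer. None of this is taken from an actual implementation.

```cpp
#include <ostream>
#include <streambuf>
#include <string>
#include <string_view>

// Hypothetical stream buffer: buffered text reaches `terminal` only on sync
// or overflow, while write_native appends to it directly.
class terminal_buf : public std::streambuf {
 public:
  explicit terminal_buf(std::string& terminal) : terminal_(terminal) {
    setp(buffer_, buffer_ + sizeof buffer_);
  }

  // Stand-in for the native Unicode API; bypasses the put area.
  bool write_native(std::string_view s) {
    terminal_ += s;
    return true;
  }

 protected:
  int sync() override {
    terminal_.append(pbase(), pptr());
    setp(buffer_, buffer_ + sizeof buffer_);
    return 0;
  }

  int_type overflow(int_type ch) override {
    sync();
    if (!traits_type::eq_int_type(ch, traits_type::eof())) {
      *pptr() = traits_type::to_char_type(ch);
      pbump(1);
    }
    return traits_type::not_eof(ch);
  }

 private:
  std::string& terminal_;
  char buffer_[64];
};

// Sketch of the terminal path: flush os first, then write natively,
// and report failure through badbit.
void print_to_terminal(std::ostream& os, terminal_buf& buf,
                       std::string_view out) {
  std::ostream::sentry guard(os);
  if (!guard) return;
  os.flush();  // drain text inserted earlier so it appears first
  if (!os || !buf.write_native(out)) os.setstate(std::ios_base::badbit);
}
```

Inserting "A is " into a std::ostream built on terminal_buf, calling print_to_terminal with "{a=2, b=4}", and then inserting '\n' and flushing leaves the modeled terminal holding the three pieces in program order. Without the os.flush() call, the natively written text would come first.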