<div dir="ltr"><div dir="ltr">On Sat, Sep 7, 2019 at 5:13 PM Tom Honermann via Lib <<a href="mailto:lib@lists.isocpp.org">lib@lists.isocpp.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p><a href="http://eel.is/c++draft/format#string.std-7" target="_blank">[format.string.std]p7</a>
states:</p>
<blockquote type="cite">
<p>The <i>positive-integer</i> in <i>width</i> is a decimal
integer defining the minimum field width. If <i>width</i> is
not specified, there is no minimum field width, and the field
width is determined based on the content of the field.</p>
</blockquote>
<p>Is field width measured in code units, code points, or something
else?</p>
<p>Consider the following example assuming a UTF-8 locale:<br>
</p>
<p><tt>std::format("{}", "\xC3\x81"); // U+00C1 { LATIN CAPITAL LETTER A WITH ACUTE }</tt><br>
<tt>std::format("{}", "\x41\xCC\x81"); // U+0041 U+0301 { LATIN CAPITAL LETTER A } { COMBINING ACUTE ACCENT }</tt><br>
</p>
<p>In both cases, the arguments encode the same user-perceived
character (Á). The first uses two UTF-8 code units to encode a
single code point that represents a single glyph in composed
form (Unicode normalization form NFC). The second uses three code
units to encode two code points, a base letter followed by a
combining mark, that represent the same glyph in decomposed form
(Unicode normalization form NFD).</p>
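<p>As a rough illustration of the first two measures, here is a minimal sketch, assuming well-formed UTF-8 input. The helper names <tt>count_code_units</tt> and <tt>count_code_points</tt> are hypothetical and are not part of the standard library or of any proposal:</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: the number of UTF-8 code units is the number of bytes.
std::size_t count_code_units(const std::string& s) {
    return s.size();
}

// Hypothetical helper: counts code points, assuming s is well-formed UTF-8.
// Every byte that is not a continuation byte (bit pattern 10xxxxxx) starts
// a new code point.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

<p>For the two arguments above, these helpers give 2 code units and 1 code point for the composed form, and 3 code units and 2 code points for the decomposed form. Counting grapheme clusters is not sketched here, since it needs the Unicode text segmentation rules (UAX #29) and their property tables.</p>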
<p>How is the field width determined? If measured in code units,
the first has a width of 2 and the second of 3. If measured in
code points, the first has a width of 1 and the second of 2. If
measured in grapheme clusters, both have a width of 1. Is the
determination locale dependent?</p></div><br></blockquote><div><br></div><div>(Coming late to the party)</div><div>Let's ask a different question.</div><div><br></div><div>std::string s = "/* some content */";<br>std::ostringstream oss;<br>oss << std::setw(22) << s;<br>std::string result1 = oss.str();<br>std::string result2 = std::format("{:22}", s);<br></div><div><br></div><div>What can we say about the contents of "result1" and "result2"?</div><div>Are they the same? Does it matter what the contents of `s` are?</div><div><br></div><div>-- Marshall</div></div></div>