<div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Nov 12, 2019, 22:09 Tom Honermann &lt;<a href="mailto:tom@honermann.net">tom@honermann.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>If implementors aren't going to be
willing to change these tables once we ship, then I think we have
a fairly serious issue.</div>
<div></div></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto">+1</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><div><br>
</div>
<div>Some have adamantly stated that these
widths are estimates only and should not be counted on to remain
stable. Code that is sensitive to the formatted size of the
output should be calling std::formatted_size and allocating
appropriately. I take it your concern is regarding code that
calls std::format_to with an assumption that the provided output
buffer is large enough? (or, code that calls std::format and
assumes the size of the resulting std::string).<br>
</div>
<div><br>
</div>
<div>Tom.<br>
</div>
<div><br>
</div>
<div>On 11/12/19 8:58 PM, Billy O'Neal (VC
LIBS) wrote:<br>
</div>
<blockquote type="cite">
<div>
<p class="MsoNormal">My only point was that the specified
behavior gives grapheme clusters a width of 1 or 2, but there
exist characters like U+FDFD that are wider than 2 (and many
that have a width of 0). I would be very nervous about changing
the constants used after std::format ships because that could
introduce unexpected buffer overruns or underruns in user
programs. This is the kind of thing that becomes contractual
very quickly (which is one of the reasons I was weakly against
trying to open this can of worms).</p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Billy3</p>
<p class="MsoNormal"><u></u> <u></u></p>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="border:none;padding:0in"><b>From:
</b><a href="mailto:tom@honermann.net" target="_blank" rel="noreferrer">Tom Honermann</a><br>
<b>Sent: </b>Tuesday, November 12, 2019 12:53 PM<br>
<b>To: </b><a href="mailto:lib-ext@lists.isocpp.org" target="_blank" rel="noreferrer">lib-ext@lists.isocpp.org</a>;
<a href="mailto:corentin.jabot@gmail.com" target="_blank" rel="noreferrer">Corentin</a><br>
<b>Cc: </b><a href="mailto:bion@microsoft.com" target="_blank" rel="noreferrer">Billy O'Neal (VC LIBS)</a>; <a href="mailto:lib@lists.isocpp.org" target="_blank" rel="noreferrer">
lib@lists.isocpp.org</a>; <a href="mailto:unicode@open-std.org" target="_blank" rel="noreferrer">SG16</a>;
<a href="mailto:victor.zverovich@gmail.com" target="_blank" rel="noreferrer">
Victor Zverovich</a><br>
<b>Subject: </b>Re: [isocpp-lib-ext] The "Let's Stop
Ascribing Meaning to Code Points" blog post</p>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><span style="color:black">On 11/12/19 6:11
PM, Billy O'Neal (VC LIBS) via Lib-Ext wrote:<u></u><u></u></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="color:black">It came up in
the context of that width thing in format and I was asking
if I had permission to make wider-than-2 characters format
properly, and the forwarded text doesn’t seem to allow
that (which is OK, I just wanted to understand at the
time); I was thinking of U+FDFD (﷽).<u></u><u></u></span></p>
</blockquote>
<p>Can you elaborate? My understanding of the forwarded wording
is that the assumed encoding for the input text is
implementation defined (though not locale sensitive) and that
implementors are encouraged to use the Unicode code point
ranges indicated in the wording, but are not required to (that
is my interpretation of the use of the word "should" in the
proposed wording).</p>
<p>It does look like the provided code point ranges don't handle
U+FDFD correctly.</p>
<p>I don't know how much confidence should be placed in the
listed code point ranges. But I think it is important that we
consider them amenable to change. I suspect that U+FDFD is
not the last code point we'll find that is not correctly
handled.</p>
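<p>As an illustration of the point about U+FDFD, here is a rough sketch (not from the thread) of a range-based width estimate of the kind being discussed. The ranges are my recollection of the width-2 code point ranges in the proposed wording and should be treated as an assumption; the names <code>CodePointRange</code>, <code>kWideRanges</code> and <code>estimated_width</code> are hypothetical.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive range of code points.
struct CodePointRange {
    std::uint32_t first;
    std::uint32_t last;
};

// Assumption: code point ranges estimated as width 2, transcribed from
// memory of the proposed wording; they may not match it exactly.
constexpr CodePointRange kWideRanges[] = {
    {0x1100, 0x115F},   {0x2329, 0x232A},   {0x2E80, 0x303E},
    {0x3040, 0xA4CF},   {0xAC00, 0xD7A3},   {0xF900, 0xFAFF},
    {0xFE10, 0xFE19},   {0xFE30, 0xFE6F},   {0xFF00, 0xFF60},
    {0xFFE0, 0xFFE6},   {0x1F300, 0x1F64F}, {0x1F900, 0x1F9FF},
    {0x20000, 0x2FFFD}, {0x30000, 0x3FFFD},
};

// Returns 2 for code points inside a listed range and 1 otherwise.
// A table like this can never answer 0 or anything above 2.
int estimated_width(std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // must be a valid Unicode code point value
    for (const CodePointRange& r : kWideRanges) {
        if (cp >= r.first && cp <= r.last) {
            return 2;
        }
    }
    return 1;
}
```

<p>With such a table, U+FDFD falls between the ranges ending at U+FAFF and starting at U+FE10, so it is estimated as width 1 even though it typically renders much wider. Combining characters that occupy no columns are likewise estimated as 1.</p>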
<p>Tom.</p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="color:black"> <u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:black">Billy3<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:black"> <u></u><u></u></span></p>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="color:black">From: </span></b><span style="color:black"><a href="mailto:corentin.jabot@gmail.com" target="_blank" rel="noreferrer">Corentin</a><br>
<b>Sent: </b>Tuesday, November 12, 2019 8:42 AM<br>
<b>To: </b><a href="mailto:lib-ext@lists.isocpp.org" target="_blank" rel="noreferrer">C++ Library Evolution Working
Group</a><br>
<b>Cc: </b><a href="mailto:lib@lists.isocpp.org" target="_blank" rel="noreferrer">lib@lists.isocpp.org</a>; <a href="mailto:bion@microsoft.com" target="_blank" rel="noreferrer">
Billy O'Neal (VC LIBS)</a>; <a href="mailto:unicode@open-std.org" target="_blank" rel="noreferrer">SG16</a><br>
<b>Subject: </b>Re: [isocpp-lib-ext] The "Let's Stop
Ascribing Meaning to Code Points" blog post<u></u><u></u></span></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="color:black">On Tue, 12
Nov 2019 at 16:58, Billy O'Neal (VC LIBS) via Lib-Ext
&lt;<a href="mailto:lib-ext@lists.isocpp.org" target="_blank" rel="noreferrer">lib-ext@lists.isocpp.org</a>&gt;
wrote:<u></u><u></u></span></p>
</div>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-left:40.8pt"><span style="color:black">During review of some Unicode
stuff in LWG, some of us had a mini discussion
about grapheme clusters, and I mentioned that
everyone who touches this stuff might understand
the complexities better if they read this:<u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:40.8pt"><span style="color:black"> <u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:40.8pt"><span style="color:black"><a href="https://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/" target="_blank" rel="noreferrer">https://manishearth.github.io/blog/2017/01/14/stop-ascribing-meaning-to-unicode-code-points/</a><u></u><u></u></span></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal"><span style="color:black"> <u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">+1<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">FYI, SG16
is aware of that blog post, and I think there is
pretty strong agreement with it.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">Codepoints
have some use (notably, the Unicode Character Database
is really the Unicode Codepoint Database, and most
Unicode algorithms work on codepoints), but any kind
of user-facing UX should deal with EGCs (extended grapheme clusters).<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">That is not
always what applications choose to do, for a variety of
reasons. Notably, Twitter's character count deals in
codepoints, web browsers' search functions use
codepoints so as to ignore diacritics, and comparisons
can be done on (normalized) codepoint sequences.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black"> <u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">There is
also not always a 1-1 mapping between what people
understand as a "character", grapheme clusters, and
glyphs.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black"> <u></u><u></u></span></p>
</div>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-left:40.8pt"><span style="color:black"> <u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:40.8pt"><span style="color:black">Billy3<u></u><u></u></span></p>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal" style="margin-left:40.8pt"><span style="color:black">_______________________________________________<br>
Lib-Ext mailing list<br>
<a href="mailto:Lib-Ext@lists.isocpp.org" target="_blank" rel="noreferrer">Lib-Ext@lists.isocpp.org</a><br>
Subscription: <a href="https://lists.isocpp.org/mailman/listinfo.cgi/lib-ext" target="_blank" rel="noreferrer">
https://lists.isocpp.org/mailman/listinfo.cgi/lib-ext</a><br>
Link to this post: <a href="http://lists.isocpp.org/lib-ext/2019/11/13606.php" target="_blank" rel="noreferrer">
http://lists.isocpp.org/lib-ext/2019/11/13606.php</a><u></u><u></u></span></p>
<pre>_______________________________________________</pre>
<pre>Lib-Ext mailing list</pre>
<pre><a href="mailto:Lib-Ext@lists.isocpp.org" target="_blank" rel="noreferrer">Lib-Ext@lists.isocpp.org</a></pre>
<pre>Subscription: <a href="https://lists.isocpp.org/mailman/listinfo.cgi/lib-ext" target="_blank" rel="noreferrer">https://lists.isocpp.org/mailman/listinfo.cgi/lib-ext</a></pre>
<pre>Link to this post: <a href="http://lists.isocpp.org/lib-ext/2019/11/13609.php" target="_blank" rel="noreferrer">http://lists.isocpp.org/lib-ext/2019/11/13609.php</a></pre>
</blockquote>
</div>
</blockquote>
<p><br>
</p>
</div>
</blockquote></div></div></div>