[SG16-Unicode] [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing Meaning to Code Points" blog post

Billy O'Neal (VC LIBS) bion at microsoft.com
Thu Nov 14 02:38:18 CET 2019


>IMO, this is the wrong way to think about stability w.r.t Unicode.  The changes that happen to Unicode are bug fixes.  If they change the results users get when they use a certain API, it's a fix, not a regression.

I agree that it isn't a regression. But whether it is a fix has nothing to do with whether it is a breaking change: a change is breaking if anyone relies on the behavior being changed. We have customers who are angry with us because we fixed printf to print doubles correctly, and we had to ship a mode for those customers that makes printf broken again.

>>>It is important to remember that width estimation is orthogonal to memory safety; format_to_n() is there to give you the memory safety part, and that will never be impacted by the width estimation piece.
>>I agree, but the same is true of sprintf vs. snprintf.
>That sounds right to me, but I don't get the implication.  Why did you bring it up?

The implication is that if the customers of format/format_to/format_to_n are anything like the customers of sprintf/snprintf, there will be users who call the unsized interface expecting the format string to make it safe.
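
To make the distinction concrete, here is a minimal sketch using snprintf. The helper name write_field and the "%.8s" format string are hypothetical, chosen only for illustration. The precision in the format string limits how many bytes of the argument are printed, but only the size passed to snprintf limits how much is written to the buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only: formats "name=<value>" into a
// caller-supplied buffer of capacity cap.
// Returns what snprintf returns: the number of chars the full output would
// need, excluding the terminating NUL.
int write_field(char* buf, std::size_t cap, const std::string& value) {
    assert(buf != nullptr && cap > 0);
    // "%.8s" caps how many bytes of value are printed. It says nothing about
    // how large buf is; only the cap argument bounds the write.
    return std::snprintf(buf, cap, "name=%.8s", value.c_str());
}
```

With an 8-byte buffer and a 12-character value, the full output would need 13 characters even after the precision is applied, so the write stays in bounds only because of the size argument. The same format string passed to sprintf would overrun the buffer.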

Billy3
________________________________
From: Zach Laine <whatwasthataddress at gmail.com>
Sent: Wednesday, November 13, 2019 04:19 PM
To: Billy O'Neal (VC LIBS) <bion at microsoft.com>
Cc: Library Working Group <lib at lists.isocpp.org>; Kirk Shoop <kirkshoop at fb.com>; lib-ext at lists.isocpp.org <lib-ext at lists.isocpp.org>; Titus Winters <titus at google.com>; Victor Zverovich <victor.zverovich at gmail.com>; Corentin <corentin.jabot at gmail.com>; Tom Honermann <tom at honermann.net>; SG16 <unicode at open-std.org>
Subject: Re: [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing Meaning to Code Points" blog post

On Wed, Nov 13, 2019 at 1:28 PM Billy O'Neal (VC LIBS) <bion at microsoft.com<mailto:bion at microsoft.com>> wrote:
>>Will you be hesitant to update the reference to the grapheme breaking algorithm if it changes in future Unicode standards as well?
>
>Yes. There's a reason why, for example, Java doesn't follow Unicode's rules in its regex implementation, because it would be a breaking change to do that.

IMO, this is the wrong way to think about stability w.r.t Unicode.  The changes that happen to Unicode are bug fixes.  If they change the results users get when they use a certain API, it's a fix, not a regression.  Adding an 8-width (or whatever it turns out to be) entry in the table for U+FDFD in a later standard falls into that category.

>>It is important to remember that width estimation is orthogonal to memory safety; format_to_n() is there to give you the memory safety part, and that will never be impacted by the width estimation piece.
>
>I agree, but the same is true of sprintf vs. snprintf.
>
>Billy3

That sounds right to me, but I don't get the implication.  Why did you bring it up?

Zach
