[SG16-Unicode] [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing Meaning to Code Points" blog post

Victor Zverovich victor.zverovich at gmail.com
Thu Nov 14 02:45:35 CET 2019


format and format_to_n will be safe, and format_to can be used unsafely,
regardless of changes to width estimation or anything else. To quote Titus,
"every change is a breaking change," and I think we should explicitly
reserve the right to change width estimation and update the Unicode
database.

- Victor
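
A minimal sketch of the sized-versus-unsized distinction, using the
snprintf half of the sprintf/snprintf comparison from the quoted message
below, since it needs no C++20 <format> support. The helper name
format_bounded and the struct BoundedResult are hypothetical and appear
nowhere in the thread. The point it illustrates is that a width in the
format string is only a minimum, so it cannot bound the output; only the
size argument does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only.
struct BoundedResult {
    std::string text;  // what actually landed in the buffer
    bool truncated;    // true if the full output did not fit
};

template <std::size_t N>
BoundedResult format_bounded(const char* value) {
    static_assert(N > 0, "buffer needs room for the terminator");
    char buf[N];
    // "%4s" pads to at least 4 characters; longer arguments are printed
    // in full. snprintf writes at most N - 1 characters plus the
    // terminating NUL and returns the length the full output would have
    // had.
    const int needed = std::snprintf(buf, N, "[%4s]", value);
    assert(needed >= 0);  // a negative result would mean an encoding error
    return {std::string(buf), static_cast<std::size_t>(needed) >= N};
}
```

With a 16-byte buffer, format_bounded<16>("ab") yields "[  ab]" untruncated.
With a 5-byte buffer, format_bounded<5>("abcdefgh") yields "[abc" and reports
truncation, where the same format string passed to sprintf would have
written past the end of the buffer.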

On Wed, Nov 13, 2019 at 5:38 PM Billy O'Neal (VC LIBS) <bion at microsoft.com>
wrote:

> >IMO, this is the wrong way to think about stability w.r.t. Unicode.  The
> changes that happen to Unicode are bug fixes.  If they change the results
> users get when they use a certain API, it's a fix, not a regression.
>
> I agree that it isn't a regression. Whether it is a fix or not has nothing
> to do with whether it is a breaking change; it's breaking if anyone relies
> on the behavior that is broken. We have customers that are angry with us
> because we fixed printf to print doubles correctly. And we had to ship a
> mode for those customers to make printf be broken again.
>
> >>>It is important to remember that width estimation is orthogonal to
> memory safety; format_to_n() is there to give you the memory safety part,
> and that will never be impacted by the width estimation piece.
> >>I agree, but the same is true of sprintf vs. snprintf.
> >That sounds right to me, but I don't get the implication.  Why did you
> bring it up?
>
> The implication is that if the customers of format/format_to/format_to_n
> are anything like the customers of sprintf/snprintf, there will be users
> who call the unsized interface expecting the format string to make it
> safe.
>
> Billy3
> ------------------------------
> *From:* Zach Laine <whatwasthataddress at gmail.com>
> *Sent:* Wednesday, November 13, 2019 04:19 PM
> *To:* Billy O'Neal (VC LIBS) <bion at microsoft.com>
> *Cc:* Library Working Group <lib at lists.isocpp.org>; Kirk Shoop <
> kirkshoop at fb.com>; lib-ext at lists.isocpp.org <lib-ext at lists.isocpp.org>;
> Titus Winters <titus at google.com>; Victor Zverovich <
> victor.zverovich at gmail.com>; Corentin <corentin.jabot at gmail.com>; Tom
> Honermann <tom at honermann.net>; SG16 <unicode at open-std.org>
> *Subject:* Re: [isocpp-lib] [isocpp-lib-ext] The "Let's Stop Ascribing
> Meaning to Code Points" blog post
>
> On Wed, Nov 13, 2019 at 1:28 PM Billy O'Neal (VC LIBS) <bion at microsoft.com>
> wrote:
>
> >Will you be hesitant to update the reference to the grapheme breaking
> algorithm if it changes in future Unicode standards as well?
>
> Yes. There's a reason why, for example, Java doesn't follow Unicode's
> rules in its regex implementation: it would be a breaking change to do
> that.
>
>
> IMO, this is the wrong way to think about stability w.r.t. Unicode.  The
> changes that happen to Unicode are bug fixes.  If they change the results
> users get when they use a certain API, it's a fix, not a regression.
> Adding an 8-width (or whatever it turns out to be) entry in the table for
> U+FDFD in a later standard falls into that category.
>
>
> >It is important to remember that width estimation is orthogonal to
> memory safety; format_to_n() is there to give you the memory safety part,
> and that will never be impacted by the width estimation piece.
>
> I agree, but the same is true of sprintf vs. snprintf.
>
> Billy3
>
>
> That sounds right to me, but I don't get the implication.  Why did you
> bring it up?
>
> Zach
>
>