[SG16-Unicode] Hidden locale dependency in [time.duration.io]?

Tom Honermann tom at honermann.net
Mon Nov 4 11:06:05 CET 2019


On 11/4/19 9:40 AM, Steve Downey wrote:
> I believe the wording around locale is merely warning that if μs isn't 
> supported by the locale associated with a stream, then the results are 
> unspecified, which is true, but unhelpful, and probably does not need 
> to be in the normative wording for this.
I think it is helpful to make it clear that the implementation does not 
(should not) make such cases "work".
>
> I'm unaware of any implementation that supports checking if string 
> literals are actually encodable. All implementations are required to 
> at least track \u00b5 until literals are encoded. This sounds like an 
> implementation that supports targeting non-Unicode encodings of 
> literals, such as MSVC, will have to use "us".

I believe gcc at least will warn in cases where the source encoding and 
execution encoding are not the same.

I would argue that MSVC can use "μs" when compiling with the 
/execution-charset:utf-8 or /utf-8 options (implicitly or explicitly) 
enabled.

Tom.

>
> On Mon, Nov 4, 2019 at 9:03 AM Howard Hinnant 
> <howard.hinnant at gmail.com> wrote:
>
>
>     On Nov 4, 2019, at 8:45 AM, Tom Honermann <tom at honermann.net> wrote:
>     >
>     > On 11/4/19 7:18 AM, Howard Hinnant wrote:
>     >> On Nov 4, 2019, at 12:27 AM, Tom Honermann <tom at honermann.net> wrote:
>     >>> I suggest the following wording: (using terminology from P1859R0)
>     >>>
>     >>> If Period::type is micro, but the character U+00B5
>     >>> <del>cannot be represented in the encoding used</del><ins>lacks
>     >>> representation in the execution character set</ins> for charT, the
>     >>> unit suffix "us" is used instead of "μs".  <ins>If
>     >>> "μs" is used but the dynamic encoding lacks representation for
>     >>> U+00B5 and the stream is associated with a terminal or console, or
>     >>> if the stream is imbued with a std::codecvt facet that lacks
>     >>> conversion support for the character, then the result is
>     >>> unspecified.</ins>
>     >>>
>     >> I’ve no objection to an issue, but your proposed wording
>     >> explicitly involves two things I’m strongly against:
>     >>
>     >> 1.  Now the code has to check the locale, for this precision only.
>     >>
>     >> 2.  Now the code has different behavior between cout and
>     >> ostringstream.  And the result of ostringstream is very commonly
>     >> subsequently sent to cout (ostringstream is a common formatting aid).
>     >>
>     >> Imo, the proposed wording is much, much worse than the
>     >> status-quo and I would vote strongly against it.
>     >
>     > No, the wording I proposed doesn't check for locale.  The
>     > execution character set is the character set used for string
>     > literals and is known at compile time; it is not the locale
>     > dependent run-time character set.
>
>
>     Here is the processed form of what you wrote (the deletes deleted,
>     the inserts inserted):
>
>     If Period::type is micro, but the character U+00B5 lacks
>     representation in the execution character set for charT, the unit
>     suffix "us" is used instead of "μs".  If "μs" is used but the
>     dynamic encoding lacks representation for U+00B5 and the stream is
>     associated with a terminal or console, or if the stream is imbued
>     with a std::codecvt facet that lacks conversion support for the
>     character, then the result is unspecified.
>
>     The phrase "or if the stream is imbued with a std::codecvt facet
>     that…" implies that the implementation gets the locale of the
>     stream, extracts the codecvt facet from it, and does something
>     with it.
>
>     I do not believe the streaming of durations of any precision
>     should involve the stream’s locale.
>
>     For microseconds precision the suffix should be “μs”, but at the
>     vendor’s discretion may be “us” instead.
>
>     I’m open to better ways of saying the sentence above.  The above
>     sentence isn’t (and shouldn’t be) stream-dependent or
>     locale-dependent.  It should not involve properties of the codecvt facet.
>
>     Howard
>
