[SG16-Unicode] Hidden locale dependency in [time.duration.io]?

Jean-Marc Bourguet jm at bourguet.org
Mon Nov 4 09:57:32 CET 2019


On 04.11.2019 09:45, Tom Honermann wrote:
> On 11/4/19 7:18 AM, Howard Hinnant wrote:
>> On Nov 4, 2019, at 12:27 AM, Tom Honermann <tom at honermann.net> wrote:
>>> I suggest the following wording: (using terminology from P1859R0)
>>> 
>>> If Period::type is micro, but the character U+00B5 <del>cannot be
>>> represented in the encoding used</del><ins>lacks representation in
>>> the execution character set</ins> for charT, the unit suffix "us" is
>>> used instead of "μs". <ins>If "μs" is used but the dynamic encoding
>>> lacks representation for U+00B5 and the stream is associated with a
>>> terminal or console, or if the stream is imbued with a std::codecvt
>>> facet that lacks conversion support for the character, then the
>>> result is unspecified.</ins>
>>> 
>> I’ve no objection to an issue, but your proposed wording explicitly 
>> involves two things I’m strongly against:
>> 
>> 1.  Now the code has to check the locale, for this precision only.
>> 
>> 2.  Now the code has different behavior between cout and 
>> ostringstream.  And the result of ostringstream is very commonly 
>> subsequently sent to cout (ostringstream is a common formatting aid).
>> 
>> Imo, the proposed wording is much, much worse than the status-quo and 
>> I would vote strongly against it.
> 
> No, the wording I proposed doesn't check for locale.  The execution
> character set is the character set used for string literals and is
> known at compile time; it is not the locale dependent run-time
> character set.

lex.charset/3 states

     The values of the members of the execution character sets and the
     sets of additional members are locale-specific.

apparently making the execution character sets run-time dependent.

But lex.ccon/2 states

     An ordinary character literal that contains a single c-char
     representable in the execution character set has type char, with
     value equal to the numerical value of the encoding of the c-char
     in the execution character set.

apparently making it fixed.
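
A minimal C++ sketch of the "fixed" reading, under the assumption that
an implementation picks the literal's value at translation time (which
is what I would expect in practice). The names kLiteralValue and
literal_unchanged_by_setlocale are hypothetical, introduced only for
illustration. It shows what a given implementation does and does not
settle what the wording actually requires.

```cpp
#include <clocale>

// Per the lex.ccon/2 reading: the value of an ordinary character literal
// is usable in a constant expression, so it is chosen at translation time.
constexpr int kLiteralValue = 'a';  // hypothetical name, for illustration
static_assert(kLiteralValue > 0, "literal value is available at compile time");

// Hypothetical helper: change the global C locale at run time and check
// whether the value of the same literal differs afterwards.
bool literal_unchanged_by_setlocale() {
    const int before = 'a';
    std::setlocale(LC_ALL, "");   // user's environment locale; the call may fail
    const int after = 'a';
    std::setlocale(LC_ALL, "C");  // restore the default "C" locale
    return before == after && after == kLiteralValue;
}
```

Calling literal_unchanged_by_setlocale() from main should return true,
since the literal's value is a constant baked in when the program is
translated, whatever locale is selected later.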

I have not looked into this in more depth to see which interpretation
is the more pervasive.

Yours,

-- Jean-Marc Bourguet

