[SG16-Unicode] [isocpp-lib] [isocpp-lib-ext] [time.duration.io] : Is stream insertion behavior locale dependent when Period::type is micro?

Tom Honermann tom at honermann.net
Wed Nov 6 18:38:34 CET 2019


On 11/6/19 5:30 PM, Billy O'Neal (VC LIBS) wrote:
>
> Corentin’s PR says “if char (the execution encoding) can always
> represent µ for your implementation, use that. Otherwise use u.” That
> means that on my implementation, where char can’t always represent µ
> because that is locale dependent, we will statically use u (and µ for
> wchar_t), but an implementation that assumes char is UTF-8 could use µ.
>
> The LWG issue’s PR says “if the stream can detect that it is targeting
> a console or codecvt facet that doesn’t support µ, an implementation
> may use u; otherwise it uses µ”. But streams have no means of doing
> that detection. (And the answer can even change if someone changes the
> streambuf.)
>
That isn't what it says (or is intended to say), nor how I read it.  It states 
that the suffix is determined by the execution character set (the 
character set used for string literals and known at compile time); that 
is in the first sentence.  The second sentence acknowledges that if the 
native character set (the run-time locale dependent character set) lacks 
representation for the character, then all bets are off with regard to 
how the character is actually displayed (or converted by a codecvt facet).

The intent of the wording was to allow Microsoft to use "µs" when the 
compiler is invoked with /execution-charset:utf-8 and to use "us" otherwise.
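
As a rough illustration of that intent (a sketch only, not taken from any implementation's source), the choice between the two suffixes can be made entirely at compile time from the encoding of ordinary string literals. The helper names literal_encoding_is_utf8 and micro_suffix are hypothetical, and the sketch assumes that UTF-8 is the only execution character set for which "µs" is selected; an implementation could reasonably accept other encodings that contain U+00B5.

```cpp
#include <string_view>

// Hypothetical helper: reports whether ordinary string literals are encoded
// as UTF-8 by checking that U+00B5 becomes the two bytes C2 B5.
// Assumption: if the execution character set cannot represent U+00B5, the
// compiler still accepts the literal (for example by substituting another
// character), in which case this check simply yields false.
constexpr bool literal_encoding_is_utf8() {
    constexpr unsigned char micro[] = "\u00B5";
    return sizeof(micro) == 3 && micro[0] == 0xC2 && micro[1] == 0xB5;
}

// Hypothetical helper: selects the unit suffix for Period::type == micro.
// The result depends only on how the translation unit was compiled, never
// on the run-time locale, the stream, or the streambuf.
constexpr std::string_view micro_suffix() {
    return literal_encoding_is_utf8() ? std::string_view("\u00B5s")
                                      : std::string_view("us");
}
```

With a compiler invocation such as /execution-charset:utf-8 this sketch would select "µs", and "us" otherwise. Whether the bytes are then displayed correctly by a terminal or survive a codecvt facet is outside what the sketch (or the wording) tries to guarantee.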

Tom.

> Billy3
>
> ------------------------------------------------------------------------
> *From:* Tom Honermann <tom at honermann.net>
> *Sent:* Wednesday, November 6, 2019 5:14:18 PM
> *To:* Billy O'Neal (VC LIBS) <bion at microsoft.com>; 
> lib at lists.isocpp.org <lib at lists.isocpp.org>; Corentin 
> <corentin.jabot at gmail.com>
> *Cc:* C++ Library Evolution Working Group <lib-ext at lists.isocpp.org>; 
> unicode at isocpp.open-std.org <unicode at open-std.org>
> *Subject:* Re: [isocpp-lib] [SG16-Unicode] [isocpp-lib-ext] 
> [time.duration.io] : Is stream insertion behavior locale dependent 
> when Period::type is micro?
> On 11/6/19 4:30 PM, Billy O'Neal (VC LIBS) wrote:
>>
>> > Please read the wording again. Note that it says that, if those 
>> conditions are true, then the result is unspecified.
>>
>> If “the wording” means the P/R of 
>> https://cplusplus.github.io/LWG/issue3314,
>> the wording there implies that we must make some effort to determine 
>> that the condition is true, which in practice we cannot do because 
>> the interface between streams and streambufs is public.
>>
> Yes, that is the wording I meant.  The intent is to ensure the 
> implementation does *not* have to put forth such effort. I don't 
> understand where such an implication is coming from, but that wording 
> has confused at least three experienced wordsmiths, so I acknowledge 
> there is an issue, but I don't understand what it is.
>
> I think it is important to say something here.  Otherwise, one could 
> claim that the terminal failing to display "μs" because it is 
> configured for an incompatible encoding is non-conforming.  Well, to 
> the extent that the standard addresses such devices.
>
> Tom.
>
>> Corentin’s P/R below seems to not have this concern.
>>
>> Billy3
>>
>> ------------------------------------------------------------------------
>> *From:* Lib <lib-bounces at lists.isocpp.org> on behalf of Tom Honermann via
>> Lib <lib at lists.isocpp.org>
>> *Sent:* Wednesday, November 6, 2019 1:12:48 PM
>> *To:* Corentin <corentin.jabot at gmail.com>
>> *Cc:* Tom Honermann <tom at honermann.net>;
>> C++ Library Evolution Working Group <lib-ext at lists.isocpp.org>;
>> Library Working Group <lib at lists.isocpp.org>;
>> unicode at isocpp.open-std.org <unicode at open-std.org>
>> *Subject:* Re: [isocpp-lib] [SG16-Unicode] [isocpp-lib-ext] 
>> [time.duration.io] : Is stream insertion behavior locale dependent 
>> when Period::type is micro?
>> The intent of the wording is to say that implementors do *not* need 
>> to be aware of terminals or codecvt facets. Without this, the wording 
>> could be read that implementations must implement magic to make the 
>> character display correctly.
>>
>> Please read the wording again. Note that it says that, if those 
>> conditions are true, then the result is unspecified.
>>
>> Tom.
>>
>> On Nov 6, 2019, at 12:07 PM, Corentin <corentin.jabot at gmail.com> wrote:
>>
>>> Then I would just say "the execution encoding associated with charT".
>>>
>>> I am extremely uncomfortable with involving the stream, the console, or
>>> anything else not known at compile time.
>>>
>>> On Wed, 6 Nov 2019 at 04:51, Tom Honermann <tom at honermann.net> wrote:
>>>
>>>     On 11/6/19 8:30 AM, Howard Hinnant wrote:
>>>>     You can comment on the LWG issue (if you want) by emailing said comment to lwgchair at gmail.com, specifying which issue you wish to comment on and supplying the comment.
>>>>
>>>>     Howard
>>>>
>>>>     On Nov 5, 2019, at 10:32 PM, Corentin via Lib-Ext <lib-ext at lists.isocpp.org> wrote:
>>>>>     Not sure how to do that procedurally, but here is some alternative wording.
>>>>>     The "runtime" locale-tied encoding is *assumed to be* a superset of the execution encoding, to the extent the standard doesn't distinguish between the two.
>>>>>
>>>>>
>>>>>     If Period::type is micro, but the <ins>abstract</ins> character <ins>µ, which has the universal character name</ins> U+00B5 cannot be represented in the <ins>execution</ins> encoding <del>used for</del><ins>associated with the character type</ins> charT, the unit suffix "us" is used instead of "µs".
>>>
>>>     Howard and I discussed the wording I proposed today and we're
>>>     now on the same page with regard to the intent.
>>>
>>>     With regard to Corentin's suggested wording above, "abstract
>>>     character" and "execution encoding" are not current terms in the
>>>     standard (well, the former is inherited from our reference to
>>>     the Unicode standard but is otherwise unused at present).
>>>     P1859R0 <http://wg21.link/p1859r0>
>>>     does intend to standardize new terminology, but we don't yet
>>>     have consensus for what the new terms should be named.  I think
>>>     we should avoid using candidate names until we have such consensus.
>>>
>>>     Tom.
>>>
>>>>>>     On Mon, 4 Nov 2019 at 15:42, Tom Honermann via Lib-Ext <lib-ext at lists.isocpp.org> wrote:
>>>>>>     A new LWG issue was filed for this question today:
>>>>>>     - https://cplusplus.github.io/LWG/issue3314
>>>>>>
>>>>>>     This issue concerns the ostream inserters added for std::chrono::duration in C++20 and the intended behavior for a duration when Period::type is micro.
>>>>>>
>>>>>>     [time.duration.io]p4 states:
>>>>>>
>>>>>>
>>>>>>>     If Period::type is micro, but the character U+00B5 cannot be represented in the encoding used for charT, the unit suffix "us" is used instead of "μs".
>>>>>>>
>>>>>>     The question is which of the encodings used for charT is referred to here: the compile-time execution character set or the run-time, locale-dependent native character set?
>>>>>>
>>>>>>     The proposed resolution specifies that the compile-time execution character set is the intended one.  My expectation is that this aligns with existing implementations, but I haven't checked.
>>>>>>
>>>>>>     Tom.
>>>>>>
>>>>>     _______________________________________________
>>>>>     Lib-Ext mailing list
>>>>>     Lib-Ext at lists.isocpp.org  <mailto:Lib-Ext at lists.isocpp.org>
>>>>>     Subscription: https://lists.isocpp.org/mailman/listinfo.cgi/lib-ext
>>>>>     Link to this post: http://lists.isocpp.org/lib-ext/2019/11/13309.php
>>>>
>>>>     _______________________________________________
>>>>     SG16 Unicode mailing list
>>>>     Unicode at isocpp.open-std.org
>>>>     http://www.open-std.org/mailman/listinfo/unicode
>>>
>>>
>
