[SG16-Unicode] [isocpp-core] What is the proper term for the locale dependent run-time character set/encoding used for the character classification and conversion functions?

Corentin Jabot corentinjabot at gmail.com
Wed Aug 14 08:49:11 CEST 2019


On Wed, Aug 14, 2019, 4:46 AM Tony V E <tvaneerd at gmail.com> wrote:

>
>
> On Tue, Aug 13, 2019 at 8:57 AM Corentin Jabot <corentinjabot at gmail.com>
> wrote:
>
>>
>>
>> On Tue, 13 Aug 2019 at 14:52, Ville Voutilainen <
>> ville.voutilainen at gmail.com> wrote:
>>
>>> On Tue, 13 Aug 2019 at 15:35, Corentin Jabot via Core
>>> <core at lists.isocpp.org> wrote:
>>> >
>>> >
>>> > Chiming in with my favorite solution:
>>> > Forbid u8/u16/u32 literals in non-Unicode-encoded files
>>>
>>> But presumably not the ones that look like u8"\u1234" ?
>>>
>>
>> Yes, there is no reason to disallow that, as it can't be misinterpreted by
>> either the compiler or people (and quite a lot of code would needlessly
>> break)
>>
>>
> I find your lack of faith in people's ability to misinterpret something
> disturbing.
> :-)
>

😁 (Challenging your mail client)


An escape such as \uXXXX (or \UXXXXXXXX) is unambiguous.

u8"é" is ambiguous. Both people and the compiler may interpret it in a
variety of ways. Notably, if I have UTF-8 in that file, which I wrote on
Linux, but the MSVC compiler then thinks it's Windows-1252...
Mojibake.
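
To make the failure concrete, here is a minimal sketch of what happens to
the two UTF-8 bytes of é (0xC3 0xA9) when a compiler decodes the source as
Windows-1252 and then encodes the u8 literal as UTF-8. The helper name
cp1252_to_utf8 is hypothetical and only illustrates the effect; it is not
how any particular compiler is implemented, and it only handles the byte
ranges where Windows-1252 agrees with ISO 8859-1.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not from the thread):
// decode `bytes` as if they were Windows-1252 and re-encode them as UTF-8.
// Only ASCII and 0xA0..0xFF are handled, where Windows-1252 matches
// ISO 8859-1; bytes in 0x80..0x9F are replaced with '?' to keep this short.
std::string cp1252_to_utf8(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII is identical in both encodings.
            out += static_cast<char>(b);
        } else if (b >= 0xA0) {
            // Code point U+00A0..U+00FF becomes a two-byte UTF-8 sequence.
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        } else {
            // 0x80..0x9F needs the real Windows-1252 table; omitted here.
            out += '?';
        }
    }
    return out;
}

// The UTF-8 encoding of U+00E9 (é), written with escapes so that this
// example does not itself depend on the source file encoding.
const std::string e_acute_utf8 = "\xC3\xA9";
```

Calling cp1252_to_utf8(e_acute_utf8) yields the four bytes C3 83 C2 A9,
which display as "Ã©" rather than "é". Writing the literal as u8"\u00E9"
avoids the problem because the escape does not depend on how the file is
decoded.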


People also seem to be confused:

https://stackoverflow.com/questions/23471935/how-are-u8-literals-supposed-to-work


> --
> Be seeing you,
> Tony
>