[SG16-Unicode] [isocpp-core] What is the proper term for the locale dependent run-time character set/encoding used for the character classification and conversion functions?

Tom Honermann tom at honermann.net
Tue Aug 13 18:07:48 CEST 2019


> On Aug 13, 2019, at 10:20 AM, Niall Douglas via Core <core at lists.isocpp.org> wrote:
> 
>> On 13/08/2019 09:38, Niall Douglas via Core wrote:
>> Before progressing with a solution, can I ask the question:
>> 
>> Is it politically feasible for C++23 and C2x to require
>> implementations to default to interpreting source files as either (i)
>> 7-bit ASCII or (ii) UTF-8? To be specific, char literals would thus be
>> either 7-bit ASCII or UTF-8.
> 
> I see that nobody has said no to this proposal yet. Yes, I agree with
> Corentin that escaped characters within literals are fine; you don't
> even need a UTF library in the compiler for those, so small C compiler
> folk won't complain.
> 
> If nobody from WG21 objects to this proposal, shall I go ask WG14?

I object, but don’t have time to respond further right now. There are existing implementations where source files are assumed, by default, to be encoded in an EBCDIC code page. I don’t want to break those implementations, nor impose on their users the significant burden such a change would carry.

This is (another) tangent from the original question. Source file encoding has nothing to do with execution encoding.

Tom. 
> 
> Because if they also don't object, then there is green grass ahoy.
> 
> Niall


