[SG16-Unicode] Identifiers in C++

Tom Honermann tom at honermann.net
Wed May 15 19:39:06 CEST 2019


Thanks for bringing this to our attention.  I agree there are 
opportunities for improvement here. I filed a new SG16 issue to track this.

https://github.com/sg16-unicode/sg16/issues/48

I encourage anyone interested in this to sign up to write a paper or 
provide additional background material in the issue (e.g., more history 
about the current list of ranges, an analysis of UAX#31 and its 
applicability to C++, etc.).
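
As a possible starting point for that kind of analysis, here is a minimal
sketch of how the zero-width characters JF mentions end up being accepted.
It lists only a small excerpt of the ranges from C++17 [lex.name], not the
full table, and the names Range, kExcerpt, and in_excerpt are hypothetical
and chosen purely for illustration.

```cpp
// Excerpt (NOT the full table) of the universal-character-name ranges that
// C++17 [lex.name] permits in identifiers. Only rows covering invisible
// characters like the ones discussed in this thread are listed.
struct Range {
    char32_t first;
    char32_t last;
};

constexpr Range kExcerpt[] = {
    {0x200B, 0x200D},  // ZERO WIDTH SPACE, ZERO WIDTH NON-JOINER, ZERO WIDTH JOINER
    {0x202A, 0x202E},  // bidirectional embedding and override controls
    {0x2060, 0x206F},  // WORD JOINER, invisible operators, related format characters
};

// Hypothetical helper: true if cp falls inside one of the excerpted ranges.
// A false result only means "not in this excerpt", not "not allowed".
constexpr bool in_excerpt(char32_t cp) {
    for (unsigned i = 0; i < sizeof(kExcerpt) / sizeof(kExcerpt[0]); ++i) {
        if (cp >= kExcerpt[i].first && cp <= kExcerpt[i].last) {
            return true;
        }
    }
    return false;
}

// Both characters called out in the original message fall inside the
// excerpted ranges, so the range list accepts them in identifiers.
static_assert(in_excerpt(0x200B), "ZERO WIDTH SPACE is in an allowed range");
static_assert(in_excerpt(0x200D), "ZERO WIDTH JOINER is in an allowed range");
```

A UAX#31-based approach would presumably replace a hand-written table like
this with the XID_Start and XID_Continue properties from the Unicode
Character Database, but that is exactly the kind of detail a paper would
need to work out.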

Tom.

On 5/10/19 12:43 PM, JF Bastien wrote:
> Hi C++ પกٱƈѻɗﻉ ḟäṅṡ 👋!
>
> The current list of valid identifier characters is pretty silly (see 
> [lex.name] 5.10 Identifiers or the cppreference summary 
> <https://en.cppreference.com/w/cpp/language/identifiers>). It 
> allows characters such as zero-width joiner and zero-width space among 
> a few silly things (see how bad this can get 
> <https://godbolt.org/z/sBJk1k>, h/t Richard Kogelnig).
>
> I asked where it came from, and IIUC John looked at Unicode and 
> cobbled the list of valid ranges manually. That ain't great.
>
> Is this group interested in fixing things?
>
> There's already an existing standard for this, maybe it's a thing we 
> can adopt as-is or use as a starting point:
>
>     https://unicode.org/reports/tr31/
>
>
> Further, the tooling group was just talking about module names. I 
> think we should allow any valid identifier name as module name, and 
> look at how this could map to file names for a tooling TR's purpose.
>
> Thanks,
>
> J̙̘̗̘̟͐̀̎F͚̜͈̖͉̗̘̊

