[SG16-Unicode] Strong code unit types

Lyberta lyberta at lyberta.net
Wed Dec 5 05:17:00 CET 2018


This is something that hit me recently. Why are we using fundamental
types for code units? CppCon 2018 is full of people saying that we
should migrate to strong types, that std::size_t should have been a
struct, etc.

I propose we add strong types for code units:

* utf8_code_unit
* utf16_code_unit
* utf32_code_unit

These will hold char8_t, char16_t and char32_t inside of them
respectively but will not allow invalid values such as 0xC0, 0xC1 and
anything above 0xF4 (244) for UTF-8, surrogates and values above
0x10FFFF for UTF-32, etc.

This will guarantee that every code unit is valid once constructed and
will allow us to write much faster code because consumers will never
need to re-check for invalid values.
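
To make the idea concrete, here is a minimal sketch of what two of these
types might look like. Nothing in it is proposed wording: the member
names (is_valid, value), the throwing constructor and the utf8_storage
alias are all assumptions made for illustration. The alias falls back
to unsigned char on compilers that do not provide char8_t yet.

```cpp
#include <cassert>
#include <stdexcept>

// Assumption: use char8_t where available, unsigned char otherwise.
#if defined(__cpp_char8_t)
using utf8_storage = char8_t;
#else
using utf8_storage = unsigned char;
#endif

// Hypothetical strong type for a single UTF-8 code unit.
class utf8_code_unit {
public:
    // 0xC0, 0xC1 and 0xF5..0xFF never appear in well-formed UTF-8.
    static constexpr bool is_valid(utf8_storage v) noexcept {
        return v != 0xC0 && v != 0xC1 && v <= 0xF4;
    }

    // Validation happens once, here; holders of the type can skip it.
    explicit utf8_code_unit(utf8_storage v) : value_(v) {
        if (!is_valid(v)) {
            throw std::invalid_argument("invalid UTF-8 code unit");
        }
    }

    utf8_storage value() const noexcept { return value_; }

private:
    utf8_storage value_;
};

// Hypothetical strong type for a single UTF-32 code unit.
class utf32_code_unit {
public:
    // Rejects surrogates (0xD800..0xDFFF) and values above 0x10FFFF.
    static constexpr bool is_valid(char32_t v) noexcept {
        return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
    }

    explicit utf32_code_unit(char32_t v) : value_(v) {
        if (!is_valid(v)) {
            throw std::invalid_argument("invalid UTF-32 code unit");
        }
    }

    char32_t value() const noexcept { return value_; }

private:
    char32_t value_;
};

// Small usage illustration: valid values construct, invalid ones throw.
inline void code_unit_example() {
    utf8_code_unit a(static_cast<utf8_storage>(0x41)); // 'A'
    assert(a.value() == 0x41);
    utf32_code_unit max(U'\U0010FFFF');                // last code point
    assert(max.value() == 0x10FFFF);
}
```

A utf16_code_unit is left out of the sketch because every 16-bit value
is a possible UTF-16 code unit, so there is nothing to reject at that
level. Note also that these types only constrain individual code units;
whether a sequence of them is well-formed is a separate check.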
