[SG16-Unicode] Strong code unit types
Steve Downey
sdowney at gmail.com
Wed Dec 5 14:05:01 CET 2018
`codepoint` also, which is probably "just" a char32_t?
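
To make that concrete, here is a minimal sketch of what such a type might look like. The name `code_point` and the `make` factory are hypothetical, not taken from any proposal. It assumes C++17 and assumes that a "code point" here means a Unicode scalar value, so surrogates are rejected along with anything above 0x10FFFF.

```cpp
#include <optional>

// Hypothetical strong type: a Unicode scalar value stored in a char32_t.
class code_point {
public:
    // Returns an empty optional for surrogates (0xD800..0xDFFF) and for
    // values above 0x10FFFF; otherwise wraps the value.
    static constexpr std::optional<code_point> make(char32_t v) noexcept {
        if (v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF))
            return std::nullopt;
        return code_point{v};
    }

    constexpr char32_t value() const noexcept { return value_; }

private:
    // Private so that every code_point has passed the check in make().
    constexpr explicit code_point(char32_t v) noexcept : value_(v) {}
    char32_t value_;
};
```

So it is "just" a char32_t in terms of storage, with the range check paid once at construction.
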
On Wed, Dec 5, 2018, 01:40 Tom Honermann <tom at honermann.net> wrote:
> On 12/4/18 11:17 PM, Lyberta wrote:
> > This is something that hit me recently. Why are we using fundamental
> > types for code units? CppCon 2018 is full of people saying that we
> > should migrate to strong types, that std::size_t should have been a
> > struct, etc.
> The primary reason for using fundamental types for code units is that
> those are the types used for character and string literals.
> >
> > I propose we add strong types for code units:
> >
> > * utf8_code_unit
> > * utf16_code_unit
> > * utf32_code_unit
> >
> > These will hold char8_t, char16_t, and char32_t inside of them respectively
> > but will not allow invalid values such as anything above 244 (0xF4) for
> > UTF-8, or surrogates and anything above 0x10FFFF for UTF-32, etc.
> > This will guarantee that all code units are valid and will allow us to
> > write much faster code because we will never need to check for invalid
> > values.
>
> The downside of such validating types is the validation overhead.
>
> I am in favor of introducing strong types for code points.
>
> Tom.
>
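
For reference, the utf8_code_unit idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `is_valid` and `make` members are hypothetical, and `unsigned char` stands in for char8_t so the sketch also compiles before C++20. The check rejects the bytes that never occur in well-formed UTF-8 (0xC0, 0xC1, and 0xF5 through 0xFF).

```cpp
#include <optional>

// Hypothetical strong type for a single UTF-8 code unit.
// unsigned char is an assumption standing in for char8_t.
class utf8_code_unit {
public:
    // Bytes 0xC0, 0xC1 and 0xF5..0xFF never occur in well-formed UTF-8.
    static constexpr bool is_valid(unsigned char b) noexcept {
        return b != 0xC0 && b != 0xC1 && b < 0xF5;
    }

    // Returns an empty optional instead of constructing an invalid unit.
    static constexpr std::optional<utf8_code_unit> make(unsigned char b) noexcept {
        if (!is_valid(b))
            return std::nullopt;
        return utf8_code_unit{b};
    }

    constexpr unsigned char value() const noexcept { return value_; }

private:
    constexpr explicit utf8_code_unit(unsigned char b) noexcept : value_(b) {}
    unsigned char value_;
};
```

Note that this only validates individual code units. A sequence of individually valid units can still be ill-formed (a lone continuation byte such as 0x80 passes the check), so sequence-level validation would still be needed, and the per-unit check is the overhead Tom mentions.
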