[SG16-Unicode] code_unit_sequence and code_point_sequence
Tom Honermann
tom at honermann.net
Wed Jun 20 03:52:05 CEST 2018
On 06/19/2018 04:19 PM, Lyberta wrote:
> keld at keldix.com:
>> Is your code point advisory the same as codepoints in 10646/Unicode, also
>> called characters in 10646?
> Yes. A code point is an unsigned 32-bit integer with a value in the
> range 0-10FFFF (hexadecimal). Modern C and C++ have the type char32_t,
> which is the most suitable for holding code points.
>
>> And why not just treat these as 32-bit wchar_t?
>> I believe this is what we do in C.
> Because the wide execution character set is implementation-defined. So
> far nobody has proposed changing that, and Windows violates the
> standard by having a 16-bit wchar_t.
Technically, Windows doesn't violate the standard by having a 16-bit
wchar_t. It violates the standard by using a wide execution character
set that defines code points that do not fit in its (16-bit) wchar_t
type. We have an issue (https://github.com/sg16-unicode/sg16/issues/9)
to track modifying the standard to enable Microsoft's implementation to
be conforming.
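
To make the distinction concrete, here is a minimal sketch of the check
involved. The helper names is_code_point and fits_in_wchar are
hypothetical (they are not from the standard or from any SG16 proposal),
and the sketch assumes the wide execution character set is Unicode. With
a 32-bit wchar_t every code point fits; with a 16-bit wchar_t anything
above U+FFFF does not.

```cpp
#include <limits>

// Hypothetical helper: true if cp lies within the Unicode code space
// (0 through 0x10FFFF, as described earlier in this thread).
constexpr bool is_code_point(char32_t cp) {
    return cp <= 0x10FFFF;
}

// Hypothetical helper: true if cp can be stored in a single wchar_t
// on this implementation. Assumes the wide execution character set
// is Unicode, which the standard does not guarantee.
constexpr bool fits_in_wchar(char32_t cp) {
    return is_code_point(cp) &&
           static_cast<unsigned long long>(cp) <=
               static_cast<unsigned long long>(
                   std::numeric_limits<wchar_t>::max());
}
```

The result of fits_in_wchar(0x1F600) therefore depends on the platform:
it is false wherever wchar_t is 16 bits wide, which is exactly the case
where a single wchar_t cannot represent every member of the character
set.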
Tom.