[SG16-Unicode] code_unit_sequence and code_point_sequence

keld at keldix.com
Wed Jun 20 19:24:52 CEST 2018


On Wed, Jun 20, 2018 at 12:13:41PM -0400, Tom Honermann wrote:
> On 06/20/2018 05:34 AM, keld at keldix.com wrote:
> >On Tue, Jun 19, 2018 at 09:52:05PM -0400, Tom Honermann wrote:
> >>On 06/19/2018 04:19 PM, Lyberta wrote:
> >>>keld at keldix.com:
> >
> >Using a 16 bit wchar_t is ok if you restrict yourself to only a 16 bit 
> >subset of UCS.
> 
> I don't disagree, but for modern applications, limiting support to the 
> BMP is a pretty significant restriction.  And modern applications need 
> to work on Windows and interact with the wchar_t based Win32 UTF-16 APIs.

I agree that this is not the state of the art. But it once was, and I think it is the reason
Microsoft uses 16 bits for wchar_t.

> >I am happy to have a specific type to handle code points that are defined 
> >to have
> >UCS code point values. I just note that I think APIs to handle such a type 
> >would need to
> >have exactly the same functionality as for handling wchar_t entities.
> 
> If I'm reading this correctly, it sounds like you are expressing a 
> preference that text interfaces should be consistently provided for 
> char, wchar_t, char16_t, char32_t (and char8_t).  If so, I agree.

My thoughts were only wchar_t and char32_t. The other types would need another layer:
they cannot generally hold a code point of the processing character type, so they
cannot be used for portable programs that work everywhere.

Most programs I work with are made for the global market, and IMHO you
should program for the global market.

Best regards
keld
