<div dir="ltr"><div dir="ltr">On Sun, Apr 28, 2019 at 4:01 PM <<a href="mailto:keld@keldix.com">keld@keldix.com</a>> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
I believe there are a number of encodings in East Asia that there will still be<br>
developed for for quite some time.<br>
<br>
major languages and toolkits and operating systems are still character set independent.<br>
some people believe that unicode has not won, and some people are not happy with<br>
the unicode consortium. why abandon a model that still delivers for all?<br>
<br>
keld<br></blockquote><div><br></div><div>I think there's really only one thing that needs to be fixed, and that's the POSIX and C locales. Right now, the standard requires them to contain exactly 256 single-byte characters (Chapter 6, Section 2, first sentence: <a href="http://pubs.opengroup.org/onlinepubs/9699919799/">http://pubs.opengroup.org/onlinepubs/9699919799/</a>).<br><br></div><div>This restriction is what has been destroying the ability to use, as a default, a large set of the encodings deployed around the world, including Unicode. I am spending time now contacting people on the C Standards Committee and trying to find the people responsible for overseeing the POSIX standard. Requiring the locale to be a single-byte encoding is not "character set independent": it means that only a small fraction of encodings (ASCII, or similar) can possibly be the default C or POSIX locale. Unicode (specifically, UTF-8) happens to work in C and C++ only because the defaults in many implementations simply pass char/wchar_t/char16_t/char32_t data through their interfaces without touching it. But the moment anyone uses facets or locales in any meaningful manner, much of it falls over.</div><div><br></div><div>POSIX and C need to acknowledge that multibyte encodings are reasonable defaults (not just recommended extensions, but plausible defaults). Until then, no: the C standard does not deliver for all, and it actively harms the development and growth of international text processing on large and small hardware systems.<br></div></div></div></div>