[SG16-Unicode] D1628R0 (Unicode character properties)

Steve Downey sdowney at gmail.com
Thu Mar 28 13:58:03 CET 2019


These are plumbing for writing Unicode algorithms, such as regex, portably.
I don't think they are something most users should interact with. As such,
I think the wide contract is appropriate. It's reasonable for the binary
properties to return false for values that are not code points.
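
To make the wide-contract point concrete, here is a rough sketch of what I
have in mind. The names (sketch_is_code_point, sketch_is_ascii_hex_digit) are
made up for illustration and are not the spellings in D1628R0. I picked
ASCII_Hex_Digit only because its membership is small enough to write inline.

```cpp
#include <cassert>

// Hypothetical helper: true for U+0000..U+10FFFF (surrogates included,
// since they are code points even though they are not scalar values).
constexpr bool sketch_is_code_point(char32_t c) noexcept {
    return c <= 0x10FFFF;
}

// Hypothetical binary property query with a wide contract: any char32_t
// value is accepted, and values that are not code points report false
// instead of being a precondition violation.
constexpr bool sketch_is_ascii_hex_digit(char32_t c) noexcept {
    return sketch_is_code_point(c) &&
           ((c >= U'0' && c <= U'9') ||
            (c >= U'A' && c <= U'F') ||
            (c >= U'a' && c <= U'f'));
}

// Small runtime demonstration of the contract.
inline void sketch_wide_contract_demo() {
    assert(sketch_is_ascii_hex_digit(U'7'));
    assert(!sketch_is_ascii_hex_digit(U'g'));
    // Not a code point: the answer is simply false.
    assert(!sketch_is_ascii_hex_digit(static_cast<char32_t>(0x110000)));
}
```

The caller never has to validate first, which is what an algorithm like a
regex engine wants when it is chewing through arbitrary input.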

I also, on reflection, think excluding char and wchar_t is misguided. The
contract is that char8_t etc. are UTF-8 etc., but it's also frequently the case
that the narrow and wide execution encodings are Unicode as well. That
GIGO errors are possible isn't a good reason to block possibly correct
use. It just encourages casting.
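
As a sketch of what accepting char and wchar_t could look like, assuming a
char32_t-based query already exists: the names below are hypothetical, and
the function just treats the value as a code point, so a caller whose
execution encoding is not Unicode gets garbage out for garbage in.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a char32_t-based property query.
constexpr bool sketch_hex_digit_u32(char32_t c) noexcept {
    return (c >= U'0' && c <= U'9') ||
           (c >= U'A' && c <= U'F') ||
           (c >= U'a' && c <= U'f');
}

// Hypothetical generic entry point: accepts char, wchar_t, char16_t and
// char32_t alike. The value goes through the matching unsigned type first
// so that a negative char does not sign-extend into a huge char32_t.
template <class CharT>
constexpr bool sketch_hex_digit_any(CharT c) noexcept {
    static_assert(std::is_integral<CharT>::value &&
                      !std::is_same<CharT, bool>::value,
                  "expects a character or integer type");
    using U = typename std::make_unsigned<CharT>::type;
    return sketch_hex_digit_u32(static_cast<char32_t>(static_cast<U>(c)));
}

// Small runtime demonstration: no cast needed at the call site.
inline void sketch_any_char_demo() {
    assert(sketch_hex_digit_any('f'));
    assert(sketch_hex_digit_any(L'9'));
    assert(!sketch_hex_digit_any(u'g'));
}
```

Without something like this, the same call sites end up writing
static_cast<char32_t>(c) by hand, and that skips the unsigned step.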

On Thu, Mar 28, 2019, 05:10 Lyberta <lyberta at lyberta.net> wrote:

> >> I guess, but do we really want our users to shove random integers in it
> >>
> >
> > Yes. I really want a wide contract there
>
> But... why?
>
> >> Yes, contract or invariant means strong type, not dumb char32_t
> >>
> >
> > TR 44 is purposefully dumb by design too.
>
> I guess it was written by people with more of a C mindset. I'm looking
> at std::chrono and love how I can never shove an integer there because
> it is ambiguous. Same with text - an integer is ambiguous without
> character set or encoding. I know this API has Unicode in its name
> but... I think I gotta try to come up with properties design that is
> compatible with my design and see if there are any bad points.
>
> Also, I know this is a bit obscure, but what about non-Unicode? I think
> having relatively universal free functions is fine and then if they get
> std::unicode_code_point as template parameter, they will select unicode
> implementation. Hence again, strong types are important.
>
> Also, consider std::ascii_character, std::shift_jis_something.. I don't
> know Shift-JIS. :/