[SG16-Unicode] Draft SG16 direction paper

Tom Honermann tom at honermann.net
Tue Oct 9 04:59:54 CEST 2018


On 10/08/2018 10:05 PM, Markus Scherer wrote:
> On Mon, Oct 8, 2018 at 6:14 PM Lyberta <lyberta at lyberta.net> wrote:
>
>     > If you do want a distinct type, why not just standardize on uint8_t?
>     > Why does it need to be a new type that is distinct from that, too?
>
>     Here's a small example why both "char" and "uint8_t" are horrible types
>     as implemented now on all major implementations:
>
>
>     std::uint8_t small_number = 65;
>     std::cout << small_number << '\n';
>
>     This will print "A" instead of 65 on all implementations I've tested it
>     on. This breaks templates that do text processing.
>
>
> Hm? This has very little to do with text processing. Someone made a 
> choice that ostream <<  small number yields a character.

I think it is a relevant example.  The concern is the inability to 
differentiate character data from numeric data, because uint8_t is a 
typedef of another type that might be a character type (unsigned char) 
or a distinct numeric type (for example, an implementation-defined 
extended integer type), depending on the implementation.

>
> If you want specific formatting of a value, you implement and call a 
> value formatter function that returns a string.
> Or you define a value class and define << for it.

I agree this is a better practice.

>
>     Personally, I think
>     we need to add "char8_t" and also a "shortest" type so implementations
>     can use "unsigned shortest" to implement std::uint8_t.
>
>
> That's called uint_least8_t, right? If it was smaller than 8 bits, it 
> would be useless for UTF-8.

I believe Lyberta is arguing for a small builtin integer type that is 
not defined in terms of a type that may be mistaken for a character 
type.

>
> And I doubt that there is any platform that supports C++11 or higher 
> and where uint8_t != uint_least8_t.

I hope that is true.  But there are platforms where uint8_t is not 
available because the smallest addressable type, and therefore 
uint_least8_t, is wider than 8 bits (CHAR_BIT > 8).

Tom.

>
> markus
>
>

