[ub] Aliasing char16_t with int_least16_t, etc.
Jeffrey Yasskin
jyasskin at google.com
Wed Oct 30 18:14:04 CET 2013
I was sent a code review today that wanted to pass an array of wchar_t
(sizeof(wchar_t)==2 on Windows) to a function taking const uint16_t*
(https://code.google.com/p/chromium/codesearch/#chromium/src/third_party/harfbuzz-ng/src/hb-buffer.cc&l=982).
The proposed code did this with "reinterpret_cast<const
uint16_t*>(the_wchar_t_pointer)", but I had to point out that this
violates [basic.lval]p10. The workarounds seem to involve either
copying the array or adding overloads to the function that pass
through to a template.
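
For concreteness, here's a minimal sketch of both the cast and the
workarounds (consume_utf16 and consume_utf16_impl are hypothetical
stand-ins for the HarfBuzz function at the link above):

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical stand-in for the HarfBuzz entry point at the link
    // above, which consumes UTF-16 code units through a const uint16_t*.
    void consume_utf16(const uint16_t* /*text*/, std::size_t /*length*/) {}

    void call_site(const wchar_t* s, std::size_t n) {
      // The code under review: UB under [basic.lval]p10, because an
      // lvalue of type uint16_t may not access an object whose dynamic
      // type is wchar_t, even when sizeof(wchar_t) == 2.
      consume_utf16(reinterpret_cast<const uint16_t*>(s), n);

      // Workaround 1: copy the array, converting each code unit.
      // (Assumes sizeof(wchar_t) == 2, as on Windows; otherwise this
      // truncates.)
      std::vector<uint16_t> copy(s, s + n);
      consume_utf16(copy.data(), copy.size());
    }

    // Workaround 2: the callee grows overloads that forward to a
    // template, so each 16-bit code-unit type is accepted without a cast.
    template <typename T16>
    void consume_utf16_impl(const T16* /*text*/, std::size_t /*length*/) {}

    inline void consume_utf16(const char16_t* t, std::size_t n) {
      consume_utf16_impl(t, n);
    }
    inline void consume_utf16(const wchar_t* t, std::size_t n) {
      consume_utf16_impl(t, n);
    }
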
Can we make this sort of aliasing defined instead? With two or three
ways to represent a UTF-16 array (wchar_t on Windows, char16_t, and
uint16_t), we're likely to see more undefined casting as users try to
avoid extra copies or perceived code bloat.
I think the change would be to add some bullets to [basic.lval]p10 (a
sketch of the effect follows the list):
* [a type that is] the (possibly cv-qualified) underlying type of the
dynamic type of the object,
* [a type that is] the (possibly cv-qualified) signed or unsigned type
corresponding to the underlying type of the dynamic type of the
object,
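
To make that concrete, here's a minimal sketch of what those two
bullets would permit, assuming an implementation (e.g. MSVC) where
wchar_t's underlying type is unsigned short, i.e. uint16_t:

    #include <cstdint>

    void proposed_ok(const wchar_t* w) {
      // First new bullet: access through the (possibly cv-qualified)
      // underlying type of the dynamic type. If wchar_t's underlying
      // type here is unsigned short (== uint16_t), this read would be
      // well-defined under the proposal; today it is UB.
      const uint16_t* u = reinterpret_cast<const uint16_t*>(w);
      uint16_t first = *u;

      // Second new bullet: the signed type corresponding to that
      // underlying type would also be allowed.
      const int16_t* s = reinterpret_cast<const int16_t*>(w);
      int16_t also_first = *s;

      (void)first;
      (void)also_first;
    }
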
Would we want to go the other way too? That is, do we want to force
everyone writing a flexible UTF-16 function to take uint16_t, or could
they accept char16_t too? If we want to let them take char16_t, we'd
need to add (see the sketch after the list):
* a (possibly cv-qualified) type whose underlying type is the dynamic
type of the object
* a (possibly cv-qualified) type whose underlying type is the signed or
unsigned type corresponding to the dynamic type of the object
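
A sketch of what those extra bullets would permit, assuming char16_t's
underlying type (uint_least16_t) is uint16_t on the implementation
(count_nonzero_units is a hypothetical example of such a function):

    #include <cstddef>
    #include <cstdint>

    // Hypothetical flexible UTF-16 consumer written against char16_t.
    std::size_t count_nonzero_units(const char16_t* text, std::size_t n) {
      std::size_t count = 0;
      for (std::size_t i = 0; i < n; ++i)
        if (text[i] != 0) ++count;
      return count;
    }

    void call_with_uint16(const uint16_t* buf, std::size_t n) {
      // The buffer's dynamic type is uint16_t; reading it through
      // char16_t (whose underlying type is uint_least16_t, assumed to
      // be uint16_t here) is UB today and would be allowed by the two
      // extra bullets.
      count_nonzero_units(reinterpret_cast<const char16_t*>(buf), n);
    }
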
Jeffrey