<div dir="ltr"><div dir="ltr">On Tue, Jun 4, 2019 at 5:39 AM Lyberta <<a href="mailto:lyberta@lyberta.net">lyberta@lyberta.net</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">We can always modify the standard so that we get strong types via<br>
compiler magic. I was thinking:<br>
<br>
utf8'a' -> std::unicode::utf8_code_unit<br>
utf16'a' -> std::unicode::utf16_code_unit<br>
utf32'a' -> std::unicode::utf32_code_unit<br>
utf8"a" -> std::unicode::utf8_code_unit_sequence_view<br>
utf16"a" -> std::unicode::utf16_code_unit_sequence_view<br>
utf32"a" -> std::unicode::utf32_code_unit_sequence_view<br>
<br>
Well, that's future. I want something I can use now.<br>
<br>
Also, does the standard require well formed sequences in literals?<br></blockquote><div><br></div><div>
No, we lobbied specifically so that you can insert "ill-formed" sequences
(i.e., code unit sequences that do not encode well-formed Unicode scalar
values) into string literals. This is to enable people who need literals
that are not strictly conformant for various reasons (testing, or
specifically creating WTF-8/CESU-8/etc. literals, and more). <br><br></div><div>Granted, the only way you can do this is by writing `\x` values explicitly in the string literal: it's a very strong signal that someone is doing something non-standard. That doesn't mean you can't assume char8_t, char16_t, and char32_t strings are well-formed: if someone's shoving in direct code unit values with backslash-X syntax, you have to assume they are a Very Smart Person Who Knows What They Are Getting Themselves Into.<br></div></div></div>
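<div>To make the `\x` point concrete, here is a minimal sketch, assuming a C++11 or later compiler. The variable names are made up for illustration; the only thing it shows is that a `\x` escape stores the code unit as written, even when the result is not well-formed UTF-16 or UTF-32.<br></div>
<pre>
```cpp
// A lone high surrogate is not a Unicode scalar value, so this is not
// well-formed UTF-16. The \x escape stores the code unit as written.
constexpr char16_t lone_surrogate[] = u"\xD800";

// A value past U+10FFFF is not well-formed UTF-32, but it still fits in
// a char32_t code unit, so the \x escape is accepted.
constexpr char32_t out_of_range[] = U"\x110000";

// By contrast, a universal-character-name may not name a surrogate,
// so a compiler rejects the following line:
// constexpr char16_t rejected[] = u"\uD800";

// Each array holds exactly one code unit plus the null terminator.
static_assert(sizeof(lone_surrogate) / sizeof(char16_t) == 2, "one code unit");
static_assert(sizeof(out_of_range) / sizeof(char32_t) == 2, "one code unit");

// The stored values are exactly what was written in the escapes.
static_assert(lone_surrogate[0] == 0xD800, "stored verbatim");
static_assert(out_of_range[0] == 0x110000, "stored verbatim");
```
</pre>
<div>The commented-out `\uD800` line is the contrast that matters: the `\u` spelling is checked against Unicode, while `\x` is the explicit opt-out.<br></div>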