<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 10/19/19 12:53 PM, Victor Zverovich
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CANawtxYGfT==_dC81LBKXvJx1rsgurRbN_GcBF-jk0oH4p3Rhg@mail.gmail.com">
<div dir="ltr">
<div>STATICALLY-WIDEN is not a hack; it's just a convenient
pseudo-function that simplifies and formalizes a bunch of
wording that originated in P0355. In the case of chrono it will
work identically for Unicode, and the only reason we didn't
mention charX_t is that neither std::format nor iostreams
support those. As Corentin wrote, you could provide formatter
specializations, including for Unicode types, and once we have
charX_t overloads of std::format (I plan to propose these for
C++23) it will just pick up those specializations. In your case
STATICALLY-WIDEN won't help because you have different
representations of units for different character types.</div>
</div>
</blockquote>
<p>I think STATICALLY-WIDEN will work for Mat's use case, as he
intends to provide distinct partial specializations for char,
wchar_t, and charN_t. Basically, Mat needs a STATICALLY-WIDEN-UTF
variant that takes a string literal of some kind (presumably
UTF-8) and "widens" it to UTF-16 or UTF-32. Note the char partial
specialization below (which I think should not be parameterized
on CharT, just Traits).<br>
</p>
<p>What isn't clear to me is how implementors will implement
STATICALLY-WIDEN. Victor, do you know what techniques
implementors are expected to employ?</p>
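<p>One technique I can imagine, purely as a sketch and not a claim
about what any implementation does: a macro pastes each encoding
prefix onto the same literal, and a helper picks the result that
matches the requested character type. The names select_literal and
WIDEN_LITERAL are made up for illustration; a char8_t case with a
u8 literal would be analogous in C++20.</p>

```cpp
#include <string_view>
#include <type_traits>

// Hypothetical helper (not a standard facility): returns the argument whose
// character type matches CharT. Discarded branches are never instantiated.
template <typename CharT>
constexpr std::basic_string_view<CharT> select_literal(
    std::string_view narrow, std::wstring_view wide,
    std::u16string_view utf16, std::u32string_view utf32) {
  if constexpr (std::is_same_v<CharT, char>) {
    return narrow;
  } else if constexpr (std::is_same_v<CharT, wchar_t>) {
    return wide;
  } else if constexpr (std::is_same_v<CharT, char16_t>) {
    return utf16;
  } else {
    static_assert(std::is_same_v<CharT, char32_t>,
                  "unsupported character type");
    return utf32;
  }
}

// Token pasting spells the same literal with each encoding prefix, so the
// compiler does the transcoding. `s` must be a plain string literal.
#define WIDEN_LITERAL(CharT, s) select_literal<CharT>(s, L##s, u##s, U##s)

// Compile-time checks of the selection.
static_assert(WIDEN_LITERAL(char, "us") == std::string_view("us"));
static_assert(WIDEN_LITERAL(char32_t, "\u00b5s") ==
              std::u32string_view(U"\u00b5s"));
```

<p>This only covers the case where the same text is wanted in every
encoding. Falling back to "u" for char while using the micro sign
elsewhere still needs separate specializations.</p>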
<p>Tom.<br>
</p>
<blockquote type="cite"
cite="mid:CANawtxYGfT==_dC81LBKXvJx1rsgurRbN_GcBF-jk0oH4p3Rhg@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>Cheers,</div>
<div>Victor<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, Oct 17, 2019 at 12:21
PM Mateusz Pusz <<a href="mailto:mateusz.pusz@gmail.com"
moz-do-not-send="true">mateusz.pusz@gmail.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Hi everyone,
<div><br>
</div>
<div>Right now I am in the process of designing and
implementing a Physical Units library that hopefully will
be a start for having such a feature in the C++ Standard
Library. You can find more info on the library here: <a
href="https://github.com/mpusz/units" target="_blank"
moz-do-not-send="true">https://github.com/mpusz/units</a>.</div>
<div><br>
</div>
<div>Recently, I started to work on the text output of
quantities. A quantity consists of a value and a unit symbol.
The latter is a perfect use case for Unicode. Consider:</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">10 us vs
10 μs</font></div>
<div><font color="#660000" face="monospace">2 kg*m/s^2 vs
2 kg⋅m/s²<br>
</font></div>
<div><br>
</div>
<div>Before C++20 we could get away with a hack: feeding
Unicode characters to `char`-based types and streams. With
the introduction of `char8_t` in C++20, it seems this will
be a bigger issue from now on. Library implementors will
have to provide two separate implementations:</div>
<div>1. For `char`-based types (string_view, ostream),
without Unicode characters</div>
<div>2. For Unicode character types</div>
<div><br>
</div>
<div>However, there are a few issues here:</div>
<div>1. As of now, we do not have <font face="monospace">std::u8cout</font>
or even <font face="monospace">std::u8ostream</font>. So
there is really no easy way to create and use a stream for
Unicode characters. So even if I implement</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">template<class
CharT, class Traits><br>
friend std::basic_ostream<CharT, Traits>&
operator<<(std::basic_ostream<CharT,
Traits>& os, const quantity& q)<br>
</font></div>
<div><br>
</div>
<div>correctly, we do not have an easy way to use it.</div>
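<div><br>
</div>
<div>To illustrate with the stream types we do have, here is a
minimal sketch. The `quantity` struct with an ASCII-only symbol
and the `to_text` helper are made up for this example. It uses
`os.widen` to insert the symbol into whatever character type the
stream uses. I would only expect it to work for `char` and
`wchar_t` streams, since those are the only character types for
which the standard library provides the required locale facets,
and it does not help with Unicode symbols at all.</div>

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the library's quantity type.
struct quantity {
  long long value;
  const char* ascii_symbol;  // ASCII-only unit symbol, e.g. "us"
};

template <class CharT, class Traits>
std::basic_ostream<CharT, Traits>& operator<<(
    std::basic_ostream<CharT, Traits>& os, const quantity& q) {
  os << q.value << os.widen(' ');
  // widen() maps each narrow character to the stream's character type
  // using the locale imbued in the stream.
  for (const char* p = q.ascii_symbol; *p != '\0'; ++p) {
    os << os.widen(*p);
  }
  return os;
}

// Helper for trying the operator with a given stream character type.
template <class CharT>
std::basic_string<CharT> to_text(const quantity& q) {
  std::basic_ostringstream<CharT> os;
  os << q;
  return os.str();
}
```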
<div><br>
</div>
<div>2. In order to implement the above, I could imagine
such an interface for a symbol prefix:</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">template<typename
CharT, typename Traits, typename Prefix, typename
Ratio><br>
inline constexpr std::basic_string_view<CharT,
Traits> prefix_symbol;<br>
</font></div>
<div><br>
</div>
<div>and its partial specializations for different
prefixes/ratios:</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">template<typename
CharT, typename Traits></font></div>
<div><font color="#660000" face="monospace">inline constexpr
std::basic_string_view<char, Traits> </font><span
style="color:rgb(102,0,0);font-family:monospace">prefix_symbol<</span><span
style="color:rgb(102,0,0);font-family:monospace">char,
Traits, </span><span
style="color:rgb(102,0,0);font-family:monospace">si_prefix,
std::micro> = "u";</span></div>
<div><span style="color:rgb(102,0,0);font-family:monospace">template<typename
CharT, typename Traits></span><br>
</div>
<div><font color="#660000" face="monospace">inline constexpr
std::basic_string_view<CharT,
Traits> prefix_symbol<CharT, Traits, si_prefix,
std::micro> = u8"\u00b5"; // µ</font></div>
<div><font color="#660000" face="monospace">template<typename
CharT, typename Traits></font></div>
<div><font color="#660000" face="monospace">inline constexpr
std::basic_string_view<CharT,
Traits> prefix_symbol<CharT, Traits, si_prefix,
std::milli> = "m";</font></div>
<div><br>
</div>
<div>The problem is that the above code will not compile.
A specialization that is generic over `CharT` cannot be
initialized with a narrow literal like "m". Also, there is no
generic mechanism to initialize all Unicode-based versions
of the type with the same literal, as each of them requires
a different prefix (u8, u, U). Providing a specialization
for every character type here is going to be a nightmare
for library authors.</div>
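<div><br>
</div>
<div>For comparison, a variant that does compile has to spell out
one partial specialization per character type. A minimal sketch,
assuming `si_prefix` is just a tag type and showing only `char`,
`char16_t` and `char32_t` (a `char8_t` one with a u8 literal would
look the same in C++20):</div>

```cpp
#include <ratio>
#include <string_view>

struct si_prefix {};  // hypothetical tag type standing in for the library's

// Primary template: empty symbol when nothing is known for the combination.
template <typename CharT, typename Traits, typename Prefix, typename Ratio>
inline constexpr std::basic_string_view<CharT, Traits> prefix_symbol{};

// char: ASCII fallback, parameterized on Traits only.
template <typename Traits>
inline constexpr std::basic_string_view<char, Traits>
    prefix_symbol<char, Traits, si_prefix, std::micro> = "u";

// char16_t and char32_t: same symbol, but each needs its own literal prefix.
template <typename Traits>
inline constexpr std::basic_string_view<char16_t, Traits>
    prefix_symbol<char16_t, Traits, si_prefix, std::micro> = u"\u00b5";

template <typename Traits>
inline constexpr std::basic_string_view<char32_t, Traits>
    prefix_symbol<char32_t, Traits, si_prefix, std::micro> = U"\u00b5";
```

<div>Multiply this by every prefix and every unit symbol and the
amount of boilerplate becomes clear.</div>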
<div><br>
</div>
<div>To solve the second problem, fmt and chrono defined
something called STATICALLY-WIDEN (<a
href="http://wg21.link/time.general" target="_blank"
moz-do-not-send="true">http://wg21.link/time.general</a>),
but it seems to be more of a specification hack than an
implementation technique. I call it a hack because it
currently addresses only `char` and `wchar_t` and does not
mention Unicode character types at all.</div>
<div><br>
</div>
<div>Dear SG16 members, do you have any BKMs (best known
methods) or suggestions on how to write a library that is
Unicode-aware and safe in an easy and approachable way?
Should we strive to provide a nice-looking representation of
units for outputs that support Unicode (console, files, etc.),
or should we, as before, support only `char` and `wchar_t` and
ignore the existence of Unicode in C++?</div>
<div><br>
</div>
<div>Please keep in mind that we hope to target C++23 with
this library.</div>
<div><br>
</div>
<div>Best</div>
<div><br>
</div>
<div>Mat</div>
</div>
_______________________________________________<br>
SG16 Unicode mailing list<br>
<a href="mailto:Unicode@isocpp.open-std.org" target="_blank"
moz-do-not-send="true">Unicode@isocpp.open-std.org</a><br>
<a href="http://www.open-std.org/mailman/listinfo/unicode"
rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.open-std.org/mailman/listinfo/unicode</a><br>
</blockquote>
</div>
<br>
</blockquote>
<p><br>
</p>
</body>
</html>