<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 10/17/19 3:02 PM, Mateusz Pusz
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CACkZv3HBQXn8xJfXQr-y0V8PmgwX_LiWtTgzNkLbkEboxQhrdw@mail.gmail.com">
<div dir="ltr">Hi everyone,
<div><br>
</div>
<div>Right now I am in the process of designing and implementing
a Physical Units library that hopefully will be a start for
having such a feature in the C++ Standard Library. You can
find more info on the library here: <a
href="https://github.com/mpusz/units" moz-do-not-send="true">https://github.com/mpusz/units</a>.</div>
<div><br>
</div>
<div>Recently, I started to work on the text output of
quantities. A quantity consists of a value and a unit symbol. The
latter is a perfect use case for Unicode. Consider:</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">10 us vs
10 μs</font></div>
<div><font color="#660000" face="monospace">2 kg*m/s^2 vs 2
kg⋅m/s²<br>
</font></div>
<div><br>
</div>
<div>Before C++20 we could get away with a hack by providing
Unicode characters to `char`-based types and streams, but with
the introduction of `char8_t` in C++20 it seems it will be a
bigger issue from now on. The library implementors will have
to provide 2 separate implementations:</div>
<div>1. For `char`-based types (string_view, ostream)
without Unicode characters</div>
<div>2. For Unicode character types</div>
<div><br>
</div>
<div>However, there are a few issues here:</div>
<div>1. As of now, we do not have <font face="monospace">std::u8cout</font>
or even <font face="monospace">std::u8ostream</font>. So
there is really no easy way to create and use a stream for
Unicode characters. So even if I implement</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">template<class
CharT, class Traits><br>
friend std::basic_ostream<CharT, Traits>&
operator<<(std::basic_ostream<CharT,
Traits>& os, const quantity& q)<br>
</font></div>
<div><br>
</div>
<div>correctly, we do not have an easy way to use it.</div>
</div>
</blockquote>
Yes, this is a problem that we need to figure out how to solve;
ideally in the C++23 time frame.<br>
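<p>For reference, here is a minimal sketch of the byte-copying
workaround that remains available in the meantime. `write_utf8` and
`micro_example` are hypothetical names, not part of the units library
or the standard library, and the sketch assumes that whatever consumes
the stream (terminal, file) interprets the bytes as UTF-8. In C++20
the `operator&lt;&lt;` overloads that would insert `char8_t` data
into a `char`-based stream are deleted, so the code units have to be
copied explicitly:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not a standard or library API): copies the code units
// of a u8 literal byte-for-byte into a char-based stream. The element type is
// deduced, so it accepts const char[] (C++17) and const char8_t[] (C++20).
template <typename CodeUnit, std::size_t N>
std::ostream& write_utf8(std::ostream& os, const CodeUnit (&literal)[N]) {
    static_assert(sizeof(CodeUnit) == 1, "expected a u8 string literal");
    for (std::size_t i = 0; i + 1 < N; ++i) {  // skip the trailing '\0'
        os.put(static_cast<char>(literal[i]));
    }
    return os;
}

// Renders the quantity "10 µs" as a std::string holding UTF-8 bytes.
inline std::string micro_example() {
    std::ostringstream os;
    os << 10 << ' ';
    write_utf8(os, u8"\u00b5s");  // U+00B5 MICRO SIGN followed by 's'
    assert(os.good());
    return os.str();
}
```

<p>This only moves bytes around. It does nothing to tell the stream
or its locale that the content is UTF-8, which is the part that still
needs a real solution.</p>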
<blockquote type="cite"
cite="mid:CACkZv3HBQXn8xJfXQr-y0V8PmgwX_LiWtTgzNkLbkEboxQhrdw@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>2. In order to implement the above, I could imagine such an
interface for a symbol prefix:</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">template<typename
CharT, typename Traits, typename Prefix, typename Ratio><br>
inline constexpr std::basic_string_view<CharT, Traits>
prefix_symbol;<br>
</font></div>
<div><br>
</div>
<div>and its partial specializations for different
prefixes/ratios:</div>
<div><br>
</div>
<div><font color="#660000" face="monospace">template<typename
CharT, typename Traits></font></div>
<div><font color="#660000" face="monospace">inline constexpr
std::basic_string_view<char, Traits> </font><span
style="color:rgb(102,0,0);font-family:monospace">prefix_symbol<</span><span
style="color:rgb(102,0,0);font-family:monospace">char,
Traits, </span><span
style="color:rgb(102,0,0);font-family:monospace">si_prefix,
std::micro> = "u";</span></div>
<div><span style="color:rgb(102,0,0);font-family:monospace">template<typename
CharT, typename Traits></span><br>
</div>
<div><font color="#660000" face="monospace">inline constexpr
std::basic_string_view<CharT,
Traits> prefix_symbol<CharT, Traits, si_prefix,
std::micro> = u8"\u00b5"; // µ</font></div>
<div><font color="#660000" face="monospace">template<typename
CharT, typename Traits></font></div>
<div><font color="#660000" face="monospace">inline constexpr
std::basic_string_view<CharT,
Traits> prefix_symbol<CharT, Traits, si_prefix,
std::milli> = "m";</font></div>
<div><br>
</div>
<div>The problem is that the above code will not compile. A
specialization that is generic over `CharT` cannot be
initialized with a narrow literal like "m". Also, there is no
generic mechanism to initialize all Unicode-based versions of
the type with the same literal, as each of them requires a
different prefix (u8, u, U). Providing a specialization for
every character type here is going to be a nightmare for
library authors.</div>
<div><br>
</div>
<div>To solve the second problem, fmt and chrono defined
something called STATICALLY-WIDEN (<a
href="http://wg21.link/time.general" moz-do-not-send="true">http://wg21.link/time.general</a>),
but it seems to be more of a specification hack than an
implementation technique. I call it a hack because it currently
addresses only `char` and `wchar_t` and does not mention
Unicode character types at all.</div>
</div>
</blockquote>
The code would be ugly, but you could implement constexpr conversion
functions using the techniques used to implement the '_as_char' UDL
in
<a class="moz-txt-link-freetext" href="https://github.com/tahonermann/char8_t-remediation/blob/master/char8_t-remediation.h">https://github.com/tahonermann/char8_t-remediation/blob/master/char8_t-remediation.h</a>.<br>
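<p>As a rough illustration of the general idea (this is not the code
from that header), a constexpr widening helper for pure-ASCII symbols
might look like the sketch below. `widened_text`, `widen_ascii`,
`milli_storage`, and `milli_symbol` are hypothetical names. The sketch
assumes an ASCII-compatible execution character set, so that each
`char` value can be copied unchanged into any other character type.
Non-ASCII symbols such as µ would need real transcoding instead.</p>

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper type: owns a fixed-size copy of a literal whose
// characters have been converted to CharT (N includes the terminator).
template <typename CharT, std::size_t N>
struct widened_text {
    CharT data[N] = {};

    constexpr std::basic_string_view<CharT> view() const {
        return std::basic_string_view<CharT>(data, N - 1);
    }
};

// Converts an ASCII-only narrow literal to CharT at compile time.
// Assumption: every character is ASCII, so a plain static_cast is enough.
template <typename CharT, std::size_t N>
constexpr widened_text<CharT, N> widen_ascii(const char (&s)[N]) {
    widened_text<CharT, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out.data[i] = static_cast<CharT>(s[i]);
    }
    return out;
}

// One definition serves every character type; the storage has static
// duration, so the string_view that refers to it stays valid.
template <typename CharT>
inline constexpr auto milli_storage = widen_ascii<CharT>("m");

template <typename CharT>
inline constexpr std::basic_string_view<CharT> milli_symbol =
    milli_storage<CharT>.view();
```

<p>With this, `milli_symbol&lt;char&gt;`, `milli_symbol&lt;wchar_t&gt;`,
`milli_symbol&lt;char16_t&gt;`, and `milli_symbol&lt;char32_t&gt;` all
come from a single literal, which avoids one specialization per
character type for the ASCII-only prefixes.</p>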
<blockquote type="cite"
cite="mid:CACkZv3HBQXn8xJfXQr-y0V8PmgwX_LiWtTgzNkLbkEboxQhrdw@mail.gmail.com">
<div dir="ltr">
<div>Dear SG16 members, do you have any BKMs (best known methods) or suggestions on
how to write a library that is Unicode-aware and safe in an
easy and approachable way? Should we strive to provide a
nice-looking representation of units for outputs that support
Unicode (console, files, etc.), or should we, as before,
support only `char` and `wchar_t` and ignore the
existence of Unicode in C++?</div>
</div>
</blockquote>
<p>It would be great if you could (continue to) bring us these use
cases and experiments to help us figure out how to solve these
problems.</p>
<p>One of our highest priorities right now is to get basic
transcoding support into the standard so that programs can at
least use Unicode as an internal encoding with relative ease.
A suite of text maintenance types is next in line. Along the
way we need to be thinking about the program environment (as
you are) and how we interact with it. Unfortunately, we're
having to build basically from the ground up.<br>
</p>
<p>Tom.<br>
</p>
<blockquote type="cite"
cite="mid:CACkZv3HBQXn8xJfXQr-y0V8PmgwX_LiWtTgzNkLbkEboxQhrdw@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>Please keep in mind that we hope the library will target
C++23.</div>
<div><br>
</div>
<div>Best</div>
<div><br>
</div>
<div>Mat</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
SG16 Unicode mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Unicode@isocpp.open-std.org">Unicode@isocpp.open-std.org</a>
<a class="moz-txt-link-freetext" href="http://www.open-std.org/mailman/listinfo/unicode">http://www.open-std.org/mailman/listinfo/unicode</a>
</pre>
</blockquote>
<p><br>
</p>
</body>
</html>