<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 9/13/19 10:35 AM, Victor Zverovich
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CANawtxYFoJ4xK8GFuYxaLOPRVU6RP85uJNgeCC1yD9dL7d-P7Q@mail.gmail.com">
<div dir="ltr">
<div>I'll report back my findings in a paper. It may not be
perfectly solvable, but I think we can come up with a good
practical approximation that addresses the main use case, and
I'm fine with not addressing esoteric ones. People somehow
manage to write CLIs that do this and work with fancy emojis
and Asian scripts even in C =).<br>
</div>
</div>
</blockquote>
<p>Please make sure the paper addresses some of the more unusual
characters. Here are a few examples, but I'm sure there are many
more.<br>
</p>
<ul>
<li>U+200B { ZERO WIDTH SPACE }</li>
<li>U+2063 { INVISIBLE SEPARATOR }</li>
<li>U+2064 { INVISIBLE PLUS }<br>
</li>
<li>Half and full width characters</li>
<li>Family emoji<br>
</li>
</ul>
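<p>To make the problem with these characters concrete, here is a
minimal sketch that compares UTF-8 byte counts with code point counts.
The helper <tt>count_code_points</tt> and the constant names are
hypothetical, chosen for illustration, and the family emoji is assumed
to be the man, woman, girl ZWJ sequence. Neither count matches the
number of columns a terminal is likely to use.</p>

```cpp
#include <cstddef>
#include <string_view>

// Counts Unicode code points in a UTF-8 string by counting every byte that
// is not a continuation byte (10xxxxxx). Assumes well-formed UTF-8 input.
constexpr std::size_t count_code_points(std::string_view utf8) {
    std::size_t n = 0;
    for (char c : utf8) {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80) ++n;
    }
    return n;
}

// U+200B ZERO WIDTH SPACE: 3 bytes, 1 code point, expected to take 0 columns.
constexpr std::string_view zero_width_space = "\xE2\x80\x8B";

// U+FF21 FULLWIDTH LATIN CAPITAL LETTER A: 3 bytes, 1 code point,
// typically rendered 2 columns wide.
constexpr std::string_view fullwidth_a = "\xEF\xBC\xA1";

// Family emoji as a ZWJ sequence (assumed here to be
// U+1F468 U+200D U+1F469 U+200D U+1F467): 18 bytes, 5 code points,
// ideally rendered as a single glyph.
constexpr std::string_view family =
    "\xF0\x9F\x91\xA8\xE2\x80\x8D\xF0\x9F\x91\xA9\xE2\x80\x8D\xF0\x9F\x91\xA7";
```

<p>With these definitions, <tt>family.size()</tt> is 18 and
<tt>count_code_points(family)</tt> is 5, so padding by either count
would over-allocate space wherever the sequence is drawn as one
glyph.</p>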
<p>I tried an experiment a little while back. I thought it would be
fun to take Eric Niebler's range-v3 calendar example
(<a class="moz-txt-link-freetext" href="https://github.com/ericniebler/range-v3/blob/master/example/calendar.cpp">https://github.com/ericniebler/range-v3/blob/master/example/calendar.cpp</a>)
and modify it to generate emoji for some holidays. I didn't
actually go so far as to modify his code, but rather just did a
simple hack to test output to a terminal.</p>
<p><tt>$ cat cal.cpp </tt><tt><br>
</tt><tt>#include &lt;clocale&gt;</tt><tt><br>
</tt><tt>#include &lt;iostream&gt;</tt><tt><br>
</tt><tt>#include &lt;locale&gt;</tt><tt><br>
</tt><tt>int main() {</tt><tt><br>
</tt><tt> std::setlocale(LC_ALL, "");</tt><tt><br>
</tt><tt> std::cout &lt;&lt;</tt><tt><br>
</tt><tt> " October November
December\n"</tt><tt><br>
</tt><tt> " 1 2 3 1 2 3 4 5 6 7
1 2 3 4 5\n"</tt><tt><br>
</tt><tt> " 4 5 6 7 8 9 10 8 9 10 11 12 13 14 6 7
8 9 10 11 12\n"</tt><tt><br>
</tt><tt> " 11 12 13 14 15 16 17 15 16 17 18 19 20 21 13 14
15 16 17 18 19\n"</tt><tt><br>
</tt><tt> " 18 19 20 21 22 23 24 22 23 24 25 \xF0\x9F\xA6\x83
27 28 20 21 22 23 24 \xF0\x9F\x8E\x84 26\n"</tt><tt><br>
</tt><tt> " 25 26 27 28 29 30 \xF0\x9F\x8E\x83 29
30 27 28 29 30 31\n";</tt><tt><br>
</tt><tt>}</tt><br>
</p>
<p>Here is what konsole on Ubuntu 18.04 displays for me today:</p>
<p><img src="cid:part1.A538BC1B.F9A79CF4@honermann.net" alt=""></p>
<p>I find it interesting that the misalignment is not consistent,
even when font support is not present.</p>
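<p>For comparison, here is a rough sketch of the kind of width
estimate a formatter could apply to the calendar rows above. The names
<tt>estimated_width</tt> and <tt>estimated_columns</tt> and the range
table are assumptions made for illustration. They are not the full
Unicode East_Asian_Width data and not what konsole or any other
terminal actually implements. The sketch also sums widths per code
point, so it would overcount ZWJ sequences such as the family
emoji.</p>

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical per-code-point width estimate (a coarse approximation).
constexpr int estimated_width(char32_t cp) {
    // Zero-width and invisible format characters, variation selectors.
    if ((cp >= 0x200B && cp <= 0x200D) || (cp >= 0x2060 && cp <= 0x2064) ||
        (cp >= 0xFE00 && cp <= 0xFE0F))
        return 0;
    // Roughly: Hangul Jamo, CJK blocks, Hangul syllables, CJK compatibility
    // ideographs, fullwidth forms, and the common emoji blocks.
    if ((cp >= 0x1100 && cp <= 0x115F) || (cp >= 0x2E80 && cp <= 0xA4CF) ||
        (cp >= 0xAC00 && cp <= 0xD7A3) || (cp >= 0xF900 && cp <= 0xFAFF) ||
        (cp >= 0xFF01 && cp <= 0xFF60) || (cp >= 0xFFE0 && cp <= 0xFFE6) ||
        (cp >= 0x1F300 && cp <= 0x1FAFF))
        return 2;
    return 1;
}

// Decodes UTF-8 and sums estimated_width over the code points.
// Assumes well-formed UTF-8; no validation is performed.
constexpr int estimated_columns(std::string_view utf8) {
    int columns = 0;
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        // Sequence length from the lead byte.
        std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        // Payload bits of the lead byte, then 6 bits per continuation byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        columns += estimated_width(cp);
        i += len;
    }
    return columns;
}
```

<p>Under these assumptions, a day cell such as <tt>" 26"</tt> and a
cell holding a space followed by one holiday emoji both come out as 3
columns, which is what the test program relies on. The rows only line
up on screen if the terminal and its font agree with that
estimate.</p>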
<p>I wasn't able to get font fallback working in the time I allotted
to this. The only way I could get emoji to appear was to install
the "fonts-noto-color-emoji" package and then change konsole's
font to select it. This is a proportional font, so of course
everything looks ridiculous.</p>
<p><img src="cid:part2.CAA371B2.215F0FCE@honermann.net" alt=""></p>
<p>Tom.<br>
</p>
<blockquote type="cite"
cite="mid:CANawtxYFoJ4xK8GFuYxaLOPRVU6RP85uJNgeCC1yD9dL7d-P7Q@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div>- Victor<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Sep 13, 2019 at 6:57
AM Niall Douglas &lt;<a
href="mailto:s_sourceforge@nedprod.com"
moz-do-not-send="true">s_sourceforge@nedprod.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On
13/09/2019 14:36, Victor Zverovich wrote:<br>
&gt;&gt; Instead of inventing something in the abstract, a
good next step would<br>
&gt;&gt; be to figure out how (in UTF-8 mode) Apple Terminal,
Gnome Terminal,<br>
&gt;&gt; Konsole, and the new Windows Terminal determine how
many terminal<br>
&gt;&gt; display columns a string takes. (I'm not
volunteering.)<br>
&gt; <br>
&gt; I'm volunteering to do this since improving handling of
width is already<br>
&gt; on my TODO list for the fmt library.<br>
<br>
I'll be interested in what you come up with on this, as I
don't think<br>
this is solvable.<br>
<br>
For example, imagine formatting into a file, and then that
file is<br>
rendered onto a console.<br>
<br>
Another example: imagine formatting into a clipboard, which on
Windows<br>
and POSIX might involve three or four renditions into
differing formats<br>
and encodings. Then the consumer of the clipboard chooses an
unknown one<br>
of those renditions, and reinterprets it in some unknown way
into a<br>
paste into some document.<br>
<br>
Personally speaking, I think the best course is to declare
codepoint or<br>
byte based formatting widths, and draw a line under it.<br>
<br>
Niall<br>
_______________________________________________<br>
SG16 Unicode mailing list<br>
<a href="mailto:Unicode@isocpp.open-std.org" target="_blank"
moz-do-not-send="true">Unicode@isocpp.open-std.org</a><br>
<a href="http://www.open-std.org/mailman/listinfo/unicode"
rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.open-std.org/mailman/listinfo/unicode</a><br>
</blockquote>
</div>
<br>
</blockquote>
</body>
</html>