<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>SG16 is seeking input from Swift and WebKit representatives to
help inform our work towards enhancing support for Unicode in the
C++ standard. In particular, we recognize the significant amount
of effort that went into the design of the Swift String type and
would like to better understand the motivations that contributed
to its current design and any pressures that might encourage
further evolution or refinement, especially any concerns that
would be deemed significant enough to warrant
backward-incompatible changes.<br>
</p>
<p>Though most of these questions specifically mention Swift, that
is an artifact of our being more familiar with Swift than the
internal workings of WebKit. Many of these questions would be
applicable to any string type designed to support Unicode. We are
therefore also interested in hearing about the string types used
by WebKit, the motivations that guided their design, and the
trade-offs that have been made. Of particular interest would be the
results of design decisions that contrast with the design of
Swift's String type.</p>
<p>Thank you in advance for any time and expertise you are willing
and able to share with us.<br>
</p>
<ol>
<li>The Swift string manifesto is about 1 1/2 years old. What have
you learned since writing it? What would you change? What have
you changed?<br>
</li>
<li>Swift strings are extended grapheme cluster (EGC) based. What
have been the best and worst consequences of this choice?</li>
<li>When porting code that operates on code units or code points
to Swift strings (e.g., when rewriting Objective-C code, or
rewriting Swift code to use String instead of NSString), has
profiling revealed performance regressions due to the switch to
EGC-based processing? If so, what action was taken to correct
them?<br>
</li>
<li>Swift strings do not enforce storage in any particular Unicode
normalization form. Was consideration given to forcing storage
in a particular form such as FCC or NFC?</li>
<li>Swift strings support comparison via normalization. Has the use
of canonical string equality been a performance issue or
a source of surprise to programmers?</li>
<li>Swift strings are not locale-sensitive. Was any consideration
given to creating a distinct locale-sensitive string type?</li>
<li>Swift strings provide a count property as required to satisfy
the Collection protocol. How often do programmers use count
(the number of EGCs in the string) inappropriately?<br>
</li>
<li>Swift strings support several memory unsafe initializers and
methods. How frequently are these used incorrectly?</li>
<li>The Swift string manifesto discussed three approaches to
handling substrings, and Swift 4 changed from "same type, shared
storage" to "different type, shared storage". Any regrets?<br>
</li>
<li>How often do you find programmers doing work at the EGC level
that would be better performed at the code unit or code point
level?</li>
<li>Likewise, how often do you find programmers working with
unicodeScalars, utf8, or utf16 views to do work better performed
at the EGC level? For what reasons does this occur? Perhaps to
work around differences in EGC boundaries across Unicode
versions or the underlying version of ICU in use?</li>
<li>Has consideration been given to exposing Unicode Character
Database properties? CharacterSet exposes some of these
properties, but have more been requested?</li>
<li>How firmly is the Swift string implementation tied to ICU? If
the C++ standard library were to add suitable Unicode support,
what would motivate reimplementing Swift strings on top of it?</li>
<li>Do Swift programmers tend to prefer string interpolation or
string formatting functions?</li>
<li>What enhancements would you most like to see in C++ to improve
Unicode support?</li>
</ol>
These questions were culled from various internal SG16 discussions.
Special thanks to JeanHeyd Meneide, Mark Zeren, and Thiago Macieira
for their contributions to crafting this list.<br>
<br>
Tom.<br>
</body>
</html>