<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto">Hi, Niall. See JeanHeyd’s <a href="http://wg21.link/p1629">http://wg21.link/p1629</a>. This is the direction SG16 is currently heading for encoding conversion functions. <div><br></div><div>JeanHeyd has an implementation available and it would be great to get feedback on how well it works for you!</div><div><br></div><div>(I agree codecvt is unfit for nearly all purposes)<br><br><div id="AppleMailSignature" dir="ltr">Tom.</div><div dir="ltr"><br>On Aug 29, 2019, at 6:57 AM, Niall Douglas &lt;<a href="mailto:s_sourceforge@nedprod.com">s_sourceforge@nedprod.com</a>&gt; wrote:<br><br></div><blockquote type="cite"><div dir="ltr"><span>As SG16 knows, I've been busy reworking path_view to meet your feedback.</span><br><span>I am finding std::codecvt to be a steaming pile of poo, and I was</span><br><span>wondering if anyone on SG16 plans to propose a more usable, modern,</span><br><span>Unicode library API?</span><br><span></span><br><span>I'm not looking for much here. 
I just want to do the following:</span><br><span></span><br><span>- UTF8 to UTF16</span><br><span>- UTF8 to narrow native encoding</span><br><span>- UTF8 to wide native encoding</span><br><span>- UTF16 to UTF8</span><br><span>- UTF16 to narrow native encoding</span><br><span>- UTF16 to wide native encoding</span><br><span>- narrow native encoding to UTF8</span><br><span>- narrow native encoding to UTF16</span><br><span>- narrow native encoding to wide native encoding</span><br><span>- wide native encoding to UTF8</span><br><span>- wide native encoding to UTF16</span><br><span>- wide native encoding to narrow native encoding</span><br><span></span><br><span>These match all the formats which filesystem::path can be constructed from.</span><br><span></span><br><span>I also want:</span><br><span></span><br><span>- An estimate of the output buffer size needed for a given input buffer</span><br><span>- Lexicographic comparison as well as reencoding</span><br><span>- More choices for handling invalid UTF input than refusing to continue,</span><br><span>e.g. replacement with space characters</span><br><span></span><br><span></span><br><span>As far as I can tell, std::codecvt can be beaten with a stick into</span><br><span>(sometimes very inefficiently) implementing most of the above. So the</span><br><span>desired functionality is present, just with an awful API which is</span><br><span>extremely easy to use incorrectly, as I can attest.</span><br><span></span><br><span>What I'd much prefer is something simple, like:</span><br><span></span><br><span>template&lt;class Char1T, class Char2T&gt;</span><br><span>int codecvt_compare(basic_string_view&lt;Char1T&gt; a,</span><br><span>basic_string_view&lt;Char2T&gt; b) noexcept;</span><br><span></span><br><span>And that's it for comparison. 
It should "just work".</span><br><span></span><br><span>For reencoding:</span><br><span></span><br><span>template&lt;class DestT, class SrcT&gt;</span><br><span>struct codecvt_reencode;</span><br><span></span><br><span>And it would have a call operator for feeding it more source data, so</span><br><span>conversion becomes a loop of calls, handling any surprises, until conversion is</span><br><span>done.</span><br><span></span><br><span></span><br><span>Before I go ahead and implement my new API, is there anything better</span><br><span>than codecvt I can use instead?</span><br><span></span><br><span>Thanks,</span><br><span>Niall</span><br><span></span><br><span>_______________________________________________</span><br><span>SG16 Unicode mailing list</span><br><span><a href="mailto:Unicode@isocpp.open-std.org">Unicode@isocpp.open-std.org</a></span><br><span><a href="http://www.open-std.org/mailman/listinfo/unicode">http://www.open-std.org/mailman/listinfo/unicode</a></span><br></div></blockquote></div></body></html>