<div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Aug 13, 2019, 10:34 PM <<a href="mailto:keld@keldix.com">keld@keldix.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">For most programs there is no default execution character set nor default<br>
execution encoding. A binary program is designed to run with the run time<br>
execution character set of the locale it runs with. So the same binary <br>
program can run with a Japanese encoding or a Danish encoding or Arabic encoding.<br>
There is no knowledge at compilation time what encoding will be used at run time<br></blockquote></div></div><div dir="auto"><br></div><div dir="auto">The standard assumes there is one. It has to. You cannot not have an encoding.</div><div dir="auto">(Of course it is broken but it's a very old assumption).</div><div dir="auto"><br></div><div dir="auto">Also there is no such thing as a Danish encoding or a Japanese encoding. There is a Danish locale and an encoding attached to that locale (UTF-8, ISO 8859). The standard doesn't always make the distinction, but it should.</div><div dir="auto"><br></div><div dir="auto">But yeah, all of that prevents people from having non-ASCII characters in their source, as ASCII is currently the only thing that will work portably.</div><div dir="auto"><br></div><div dir="auto">This is not inherent to C++, which is one reason other languages converged on UTF-8 as the default/only encoding.</div><div dir="auto">(The primary reason being that the Unicode character set is actually useful for storing text.)</div><div dir="auto"><br></div><div dir="auto"><br></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
keld<br>
<br>
On Tue, Aug 13, 2019 at 04:10:29PM -0400, Steve Downey wrote:<br>
> Getting back to the original question. I think execution character set and<br>
> execution encoding would refer to the encoding specified by the default<br>
> locale, the "C" locale. We do not change the execution encoding via calls<br>
> to setlocale(), we change the global default locale to a new locale.<br>
> <br>
> Any name is going to be confusing. I think it's better to just get an<br>
> explicit definition to go together with the term. Something like that the<br>
> execution encoding is the same as the default character set associated with<br>
> the default "C" locale, and that it is IFNDR if the actual default<br>
> character set is different than the presumed translation from source<br>
> encoding to execution encoding, or if translation units with different<br>
> execution encodings are linked together. IFNDR because I don't see how it<br>
> could always be detected but it can quickly turn into ODR violations where<br>
> the same named object has different definitions.<br>
> <br>
> On Tue, Aug 13, 2019 at 1:22 PM Corentin <<a href="mailto:corentin.jabot@gmail.com" target="_blank" rel="noreferrer">corentin.jabot@gmail.com</a>> wrote:<br>
> <br>
> ><br>
> ><br>
> > On Tue, Aug 13, 2019, 7:08 PM Thiago Macieira <<a href="mailto:thiago@macieira.org" target="_blank" rel="noreferrer">thiago@macieira.org</a>> wrote:<br>
> ><br>
> >> On Tuesday, 13 August 2019 09:55:07 PDT Corentin wrote:<br>
> >> > (if anyone is thinking about that, I don't recommend it. You're going<br>
> >> to run<br>
> >> > into size limits: ICC at 512kB and MSVC at 256kB. Use something like<br>
> >> xxd -i<br>
> >> > to generate a brace-delimited array instead)<br>
> >> ><br>
> >> > Afaik that works if you use \x to escape every byte otherwise some<br>
> >> > implementation will mess with your data. Nothing is guaranteed to be<br>
> >> > passthrough otherwise<br>
> >><br>
> >> That would be ideal, but the problem I had was the unavailability of<br>
> >> proper<br>
> >> tools to convert the input into a form that the C++ compiler could<br>
> >> consume. I<br>
> >> was trying to do with a simple concatenation of a header, data, and<br>
> >> footer.<br>
> >><br>
> >> The end result is a shell script, a Perl script and a powershell script:<br>
> >> <a href="https://codereview.qt-project.org/c/qt/qtbase/+/263548" rel="noreferrer noreferrer" target="_blank">https://codereview.qt-project.org/c/qt/qtbase/+/263548</a><br>
> ><br>
> ><br>
> > Interesting ! std::embed could be useful there (we are going a bit off<br>
> > script). Some kind of raw bytes literals or an implementation that would<br>
> > optimize parsing arrays of literals such that it is as efficient at compile<br>
> > time as strings would also be nice.<br>
> ><br>
> >><br>
> >> --<br>
> >> Thiago Macieira - thiago (AT) <a href="http://macieira.info" rel="noreferrer noreferrer" target="_blank">macieira.info</a> - thiago (AT) <a href="http://kde.org" rel="noreferrer noreferrer" target="_blank">kde.org</a><br>
> >> Software Architect - Intel System Software Products<br>
> >><br>
> >><br>
> >><br>
> >> _______________________________________________<br>
> > SG16 Unicode mailing list<br>
> > <a href="mailto:Unicode@isocpp.open-std.org" target="_blank" rel="noreferrer">Unicode@isocpp.open-std.org</a><br>
> > <a href="http://www.open-std.org/mailman/listinfo/unicode" rel="noreferrer noreferrer" target="_blank">http://www.open-std.org/mailman/listinfo/unicode</a><br>
> ><br>
<br>
<br>
</blockquote></div></div></div>