<div dir="ltr">On Mon, Mar 12, 2018 at 2:36 PM, Lawrence Crowl <span dir="ltr"><<a href="mailto:Lawrence@crowl.org" target="_blank">Lawrence@crowl.org</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span class="gmail-">On 3/12/18, Myria <<a href="mailto:myriachan@gmail.com">myriachan@gmail.com</a>> wrote:<br>
> On Mon, Mar 12, 2018 at 13:32 Lawrence Crowl <<a href="mailto:Lawrence@crowl.org">Lawrence@crowl.org</a>> wrote:<br>
>> On 3/12/18, Myria <<a href="mailto:myriachan@gmail.com">myriachan@gmail.com</a>> wrote:<br>
>>> The severity of the current situation is that I generally avoid signed<br>
>>> integers if I intend to do any arithmetic on them whatsoever, lest the<br>
>>> compiler decide to make demons come out of my nose.<br>
>>> And even then, I'm not safe:<br>
>>><br>
>>> std::uint16_t x = 0xFFFF;<br>
>>> x *= x; // undefined behavior on most modern platforms<br>
>><br>
>> How? The C++ standard defines unsigned arithmetic as<br>
>> modular arithmetic.<br>
><br>
> But that's the catch: it's double secret signed arithmetic. [...]<br>
> On a "typical modern platform", std::uint16_t is unsigned short. [...]<br>
> 65535 * 65535 overflows a signed int on a typical 32-bit int platform,<br>
> which is undefined behavior.<br>
<br>
</span>Good example.<br></blockquote><div><br></div><div>Yes.</div><div>I have now added `uint16_t(65535) * uint16_t(65535)` as a row in the second table in <a href="https://quuxplusone.github.io/draft/twosc-conservative.html">https://quuxplusone.github.io/draft/twosc-conservative.html</a>. Unfortunately, my "conservative" two's-complement idea would not fix it: multiplication is an arithmetic operation (not a bitwise operation), and by the time it happens, the standard integral promotions have already kicked in, so the multiplication is performed on signed ints.</div><div>With JF Bastien's two's-complement proposal P0907R0, the multiplication would take place in signed int as if -fwrapv were in effect, producing a well-defined answer of `int(-131071)`. This is still the "wrong type," but converting it back down to uint16_t reduces it modulo 2^16 and yields 1, which is the expected result; that conversion is well-defined even in present-day C++.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span class="gmail-">
>> More importantly, what happens to your program when x*x < x?<br>
><br>
> The code that led me to finding this was a 16-bit variant of the FNV<br>
> hash function, so it worked properly after the correct casts were added<br>
> to allow the wrap.<br>
<br>
</span>So the application intended modular arithmetic? I was concerned about<br>
the normal case where 'unsigned' is used to constrain the value range,<br>
not to get modular arithmetic.<br></blockquote><div><br></div><div>IMNSHO, if anyone is using unsigned types "to constrain the value range," they are doing computers wrong. That is <i>not</i> what signed vs. unsigned types are for.</div><div><br></div><div>As Lawrence himself wrote earlier in this thread:</div><div>> If integer overflow is undefined behavior, then it is wrong. Tools can detect wrong programs and report them.<br></div><div>The flip side is: "If the programmer is using a type where integer overflow is well-defined to wrap, then we can assume that the program relies on that wrapping behavior (because there would otherwise be a strong incentive for the programmer to use a type that detects and reports unintended overflow)."</div><div><br></div><div><br></div><div>The original design for the STL contained the "unsigned for value range" antipattern. Consequently, its designers ran into trouble immediately: for example, `<a href="https://en.cppreference.com/w/cpp/string/basic_string/find">std::string::find</a>` returns an index into the string, naturally of type `std::string::size_type`. But size_type is unsigned! So instead of returning "negative 1" to indicate the "not found" case, they had to make it return `size_type(-1)`, a.k.a. `std::string::npos` — which is a positive value!
This means that callers have to write cumbersome things such as</div><div><br></div><div> if (s.find('k') != std::string::npos)</div><div><br></div><div>where it would be more natural to write</div><div><br></div><div> if (s.find('k') >= 0)</div><div><br></div><div>This is sort of parallel to my quotation of Lawrence above: If every possible value in the domain of a given type is a valid output (e.g. from `find`), then there is no value left over with which the function can signal failure at runtime. And if every possible value in the domain is a valid <i>input</i> (e.g. to `malloc`), then there is no way for the function to detect incorrect input at runtime.</div><div><br></div><div>If it weren't for the STL's `size_type` snafu continually muddying the waters for new learners, I doubt people would be falling into the "unsigned for value range" antipattern anymore.</div><div><br></div><div>–Arthur</div></div></div></div>
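<div>P.S. For readers who want the `uint16_t` case from earlier in this message spelled out, here is a minimal sketch. It assumes C++11 or later; the helper name `mul_u16_wrapping` is hypothetical (it is not from Myria's code or from either proposal), and the cast shown is just one way to keep the multiplication out of signed int.</div>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiply two 16-bit values with wraparound modulo 2^16.
// Converting one operand to unsigned int first means the usual arithmetic
// conversions make the whole multiplication unsigned, so it wraps instead of
// overflowing a signed int on platforms where int is 32 bits.
constexpr std::uint16_t mul_u16_wrapping(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(static_cast<unsigned int>(a) * b);
}

// 0xFFFF * 0xFFFF is 0xFFFE0001; the low 16 bits are 0x0001.
static_assert(mul_u16_wrapping(0xFFFF, 0xFFFF) == 1, "wraps modulo 2^16");

// Converting a negative int to an unsigned type is already defined as
// reduction modulo 2^N, so int(-131071) converts to 1 as well.
static_assert(static_cast<std::uint16_t>(-131071) == 1, "modular conversion");

// Runtime version of the same checks, for use from ordinary code.
inline void check_mul_u16_wrapping() {
    std::uint16_t x = 0xFFFF;
    x = mul_u16_wrapping(x, x);  // instead of `x *= x;`
    assert(x == 1);
}
```

<div>The plain `x *= x` from the quoted example promotes both operands to signed int before multiplying, which is where the undefined behavior comes from; the sketch only changes the type in which the multiplication is carried out, not the intended result.</div>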