<p dir="ltr">I think I agree with everything in Ion's email, especially that the ability to detect padding bits is useful, and that banning CHAR_BIT>8 is probably a bad idea.</p>
<p dir="ltr">One wrinkle in the goal of standardizing 2's complement is the ability to reinterpret the bytes of a negative integer as an array of chars: that seems harder to emulate, and possibly less necessary for programs to have portable behavior. If we have functions to serialize and deserialize integers as 2's-complement byte arrays, we may not need the ability to memcpy integers to and from that form. 2's-complement behavior in conversions and bitwise operations may be enough.</p>
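<p dir="ltr">A minimal sketch of what such serialization functions might look like, assuming 32-bit values and big-endian octet order. The names <code>to_twos_complement_be</code> and <code>from_twos_complement_be</code> are hypothetical, not proposed library names. The sketch relies only on arithmetic and on signed-to-unsigned conversion, which is already defined modulo 2^N, so it never inspects the native object representation:</p>

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not a standard function): encode v as four
// big-endian two's-complement octets, whatever the native representation.
inline std::array<unsigned char, 4> to_twos_complement_be(std::int_least32_t v) {
    // Signed-to-unsigned conversion is defined modulo 2^N, so the low 32 bits
    // of u hold the two's-complement pattern even on 1's-complement or
    // sign-magnitude hardware.
    const std::uint_least32_t u = static_cast<std::uint_least32_t>(v) & 0xFFFFFFFFu;
    return {{static_cast<unsigned char>((u >> 24) & 0xFFu),
             static_cast<unsigned char>((u >> 16) & 0xFFu),
             static_cast<unsigned char>((u >> 8) & 0xFFu),
             static_cast<unsigned char>(u & 0xFFu)}};
}

// Hypothetical helper: decode four big-endian two's-complement octets.
inline std::int_least32_t from_twos_complement_be(const std::array<unsigned char, 4>& b) {
    std::uint_least32_t u = 0;
    for (unsigned char octet : b) {
        u = (u << 8) | (octet & 0xFFu);  // use only the low 8 bits of each char
    }
    if (u <= 0x7FFFFFFFu) {
        return static_cast<std::int_least32_t>(u);  // non-negative: value fits
    }
    // Negative: the value is u - 2^32. Compute it as -1 - (2^32 - 1 - u) so
    // that no out-of-range unsigned-to-signed conversion is needed.
    return -1 - static_cast<std::int_least32_t>(0xFFFFFFFFu - u);
}
```

<p dir="ltr">With these, -2 encodes as the octets FF FF FF FE. Note that -2^31 only round-trips if the native type can represent it, which a 32-bit 1's-complement or sign-magnitude type cannot.</p>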
<div class="gmail_quote">On Oct 27, 2013 3:46 PM, "Ion Gaztañaga" &lt;<a href="mailto:igaztanaga@gmail.com">igaztanaga@gmail.com</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 27/10/2013 18:12, Jeffrey Yasskin wrote:<br>
> And AFAICS they didn't bother to implement a C++ compiler at all,<br>
> indicating to me that the niche for C is that of being easy to<br>
> implement, not that of supporting more machines (since it _doesn't_<br>
> support the efficient mode for this machine).<br>
><br>
> If we make the C++ definition stricter, either unusual machines will<br>
> keep implementing just C because it's still easier, or they'll<br>
> implement a non-conforming mode for C++ as the default and a<br>
> conforming mode as a switch, just like they do for C.<br>
<br>
I think we have two separate issues here. We might have a different<br>
answer to each question:<br>
<br>
1) One's complement, sign-magnitude / padding bits<br>
<br>
Non two's complement (with/without padding bits) machines seem to be old<br>
architectures that have survived until now in some sectors (government,<br>
financial, health) where backwards compatibility with old mainframes is<br>
important.<br>
<br>
Unisys was formed through a merger of mainframe corporations<br>
Sperry-Univac and Burroughs, so Clearpath systems are available in two<br>
variants: a UNISYS 2200-based system (Sperry, one's complement 36 bit<br>
machines) or an MCP-based system (Burroughs, sign-magnitude 48/8 bit<br>
machines). According to their website, new Intel based mainframes are<br>
being designed (so they'd need to emulate 1's complement /<br>
sign-magnitude behaviour through the compiler). They have no C++ compiler<br>
and Java is executed with additional 2's complement emulation code. They<br>
are migrating from custom ASICs to Intel processors<br>
(<a href="http://www.theregister.co.uk/2011/05/10/unisys_clearpath_mainframe/" target="_blank">http://www.theregister.co.uk/2011/05/10/unisys_clearpath_mainframe/</a>) so<br>
2's complement will be faster in newer mainframes than 1's complement.<br>
<br>
I think requiring 2's complement in the long term would be a good idea,<br>
even in C, as no new architecture uses any other representation and this<br>
simplifies teaching and programming in C/C++. We could start by having an<br>
ISO C macro (for C compatibility) to detect 2's complement at compile<br>
time and deprecate 1's complement and sign-magnitude representations for<br>
C++. If no one objects then only 2's complement could be allowed for the<br>
next standard.<br>
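<br>
A minimal sketch of how such a compile-time check can already be written,<br>
assuming the preprocessor evaluates signed arithmetic with the target's<br>
representation. The macro name EXAMPLE_TWOS_COMPLEMENT and the SignedRep<br>
enumeration are hypothetical, not part of ISO C or C++:<br>

```cpp
// Hypothetical macro name, not part of ISO C or C++.
// The low two bits of -1 identify the representation:
//   two's complement: ...11 (3), one's complement: ...10 (2),
//   sign-magnitude:   ...01 (1).
#define EXAMPLE_TWOS_COMPLEMENT ((-1 & 3) == 3)

enum class SignedRep { TwosComplement, OnesComplement, SignMagnitude };

// Same test as the macro, but usable in ordinary constant expressions.
constexpr SignedRep signed_representation() {
    return (-1 & 3) == 3 ? SignedRep::TwosComplement
         : (-1 & 3) == 2 ? SignedRep::OnesComplement
                         : SignedRep::SignMagnitude;
}

#if EXAMPLE_TWOS_COMPLEMENT
// Code that depends on two's-complement bit patterns could go here.
#endif
```

A standard macro would remove the need for each project to carry its own<br>
variant of this trick.<br>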
<br>
It would be interesting to have more guarantees on 2's complement<br>
systems (say no padding bits, other than in bool), but I don't know if<br>
that would be possible as I think there are Cray machines with padding<br>
bits in short/int/pointer types:<br>
<br>
<a href="http://docs.cray.com/books/004-2179-001/html-004-2179-001/rvc5mrwh.html#QEARLRWH" target="_blank">http://docs.cray.com/books/004-2179-001/html-004-2179-001/rvc5mrwh.html#QEARLRWH</a><br>
<br>
At least it would be interesting to have a simple way to detect types<br>
with padding bits.<br>
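<br>
For integer types, one possible sketch compares the storage size with the<br>
number of value and sign bits reported by std::numeric_limits. The function<br>
name padding_bits is hypothetical, and the sketch does not cover pointers or<br>
floating-point types:<br>

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Hypothetical helper: number of padding bits in integer type T.
// numeric_limits<T>::digits counts value bits only, so the sign bit
// is added back for signed types.
template <typename T>
constexpr std::size_t padding_bits() {
    static_assert(std::numeric_limits<T>::is_integer,
                  "padding_bits is only sketched for integer types");
    return sizeof(T) * CHAR_BIT
         - static_cast<std::size_t>(std::numeric_limits<T>::digits
                                    + (std::numeric_limits<T>::is_signed ? 1 : 0));
}

// unsigned char is guaranteed to have no padding bits.
static_assert(padding_bits<unsigned char>() == 0, "unsigned char has no padding");
```

On a machine like the Cray described above, padding_bits for short would<br>
presumably be non-zero, while for bool it is sizeof(bool) * CHAR_BIT - 1<br>
everywhere.<br>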
<br>
2) CHAR_BIT > 8<br>
<br>
Architectures with CHAR_BIT > 8 are being designed these days and they<br>
have a very good reason to support only word (16-24-32 bit) based types:<br>
performance. Word-multiple memory accesses and operands simplify the<br>
design, speed it up, and allow bigger caches and arithmetic units. They<br>
also make it easier to fetch several instructions and operands in parallel<br>
and let every transistor do what a DSP is supposed to do: very<br>
high-speed data processing.<br>
<br>
These DSPs have modern C++ compilers (VisualDSP++ 5.0 C/C++ Compiler<br>
Manual for SHARC Processors,<br>
<a href="http://www.analog.com/static/imported-files/software_manuals/50_21k_cc_man.rev1.1.pdf" target="_blank">http://www.analog.com/static/imported-files/software_manuals/50_21k_cc_man.rev1.1.pdf</a>).<br>
<br>
"Analog Devices does not support data sizes smaller than the addressable<br>
unit size on the processor. For the ADSP-21xxx processors, this means<br>
that both short and char have the same size as int. Although 32-bit<br>
chars are unusual, they do conform to the standard"<br>
<br>
"All the standard features of C++ are accepted in the default mode<br>
except exception handling and run-time type identification because these<br>
impose a run-time overhead that is not desirable for all embedded<br>
programs. Support for these features can be enabled with the -eh and<br>
-rtti switches."<br>
<br>
In DSPs that can be configured in byte-addressing mode (instead of the<br>
default word-addressing mode) stdint.h types are accordingly defined<br>
(int8_t and friends only exist in byte-addressing mode). Example:<br>
TigerSHARC DSPs (VisualDSP++ for TigerSHARC processors:<br>
<a href="http://www.analog.com/static/imported-files/software_manuals/50_ts_cc_man.4.1.pdf" target="_blank">http://www.analog.com/static/imported-files/software_manuals/50_ts_cc_man.4.1.pdf</a>).<br>
Even pointer implementations are optimized for Word-addressing (taken<br>
from the C compiler manual):<br>
<br>
"Pointers<br>
<br>
The pointer representation uses the low-order 30 bits to address the<br>
word and the high-order two bits to address the byte within the word.<br>
Due to the pointer implementation, the address range in byte-addressing<br>
mode is 0x00000000 to 0x3FFFFFFF.<br>
<br>
The main advantage of using the high-order bits to address the bytes<br>
within the word as opposed to using the low-order bits is that all<br>
pointers that address word boundaries are compatible with existing code.<br>
This choice means there is no performance loss when accessing 32-bit items.<br>
<br>
A minor disadvantage with this representation is that address arithmetic<br>
is slower than using low-order bits to address the bytes within a word<br>
when the computation might involve part-word offsets."<br>
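<br>
The trade-off in the quoted passage can be illustrated with a small model.<br>
This is only a sketch of the layout as described (low 30 bits for the word,<br>
high 2 bits for the byte within the word), not the compiler's actual<br>
implementation. The type BytePointer and all function names are hypothetical:<br>

```cpp
#include <cstdint>

// Hypothetical model of the described pointer layout.
struct BytePointer { std::uint32_t bits; };

constexpr std::uint32_t kWordMask = 0x3FFFFFFFu;  // low-order 30 bits

constexpr BytePointer make_byte_pointer(std::uint32_t word, std::uint32_t byte) {
    return BytePointer{((byte & 3u) << 30) | (word & kWordMask)};
}

constexpr std::uint32_t word_address(BytePointer p) { return p.bits & kWordMask; }
constexpr std::uint32_t byte_in_word(BytePointer p) { return p.bits >> 30; }

// Advancing by n bytes cannot be a single add: the fields are unpacked into
// a linear byte index, adjusted, and repacked. This is the slower address
// arithmetic the manual mentions for part-word offsets.
inline BytePointer advance(BytePointer p, std::uint32_t n) {
    const std::uint64_t linear =
        static_cast<std::uint64_t>(word_address(p)) * 4u + byte_in_word(p) + n;
    return make_byte_pointer(static_cast<std::uint32_t>((linear / 4u) & kWordMask),
                             static_cast<std::uint32_t>(linear % 4u));
}
```

In this model a pointer to byte 0 of a word has the same bit pattern as the<br>
plain word address, which is the compatibility advantage the manual describes.<br>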
<br>
I think banning or deprecating systems with CHAR_BIT != 8 would be a<br>
very bad idea as C++ is a natural choice for high-performance<br>
data/signal processors.<br>
<br>
Best,<br>
<br>
Ion<br>
_______________________________________________<br>
ub mailing list<br>
<a href="mailto:ub@isocpp.open-std.org">ub@isocpp.open-std.org</a><br>
<a href="http://www.open-std.org/mailman/listinfo/ub" target="_blank">http://www.open-std.org/mailman/listinfo/ub</a><br>
</blockquote></div>