[ub] [c++std-ext-14592] Re: Re: Sized integer types and char bits

Ion Gaztañaga igaztanaga at gmail.com
Sun Oct 27 23:46:35 CET 2013


On 27/10/2013 18:12, Jeffrey Yasskin wrote:
> And AFAICS they didn't bother to implement a C++ compiler at all,
> indicating to me that the niche for C is that of being easy to
> implement, not that of supporting more machines (since it _doesn't_
> support the efficient mode for this machine).
>
> If we make the C++ definition stricter, either unusual machines will
> keep implementing just C because it's still easier, or they'll
> implement a non-conforming mode for C++ as the default and a
> conforming mode as a switch, just like they do for C.

I think we have two separate issues here. We might have a different 
answer to each question:

1) One's complement, sign-magnitude / padding bits

Non two's complement (with/without padding bits) machines seem to be old 
architectures that have survived until now in some sectors (government, 
financial, health) where backwards compatibility with old mainframes is 
important.

Unisys was formed through a merger of mainframe corporations 
Sperry-Univac and Burroughs, so Clearpath systems are available in two 
variants: a UNISYS 2200-based system (Sperry, one's complement 36 bit 
machines) or an MCP-based system (Burroughs, sign-magnitude 48/8 bit 
machines). According to their website, new Intel based mainframes are 
being designed (so they'd need to emulate 1's complement /
sign-magnitude behaviour through the compiler). They have no C++ compiler
and Java is executed with additional 2's complement emulation code. They 
are migrating from custom ASICs to Intel processors 
(http://www.theregister.co.uk/2011/05/10/unisys_clearpath_mainframe/) so 
2's complement will be faster in newer mainframes than 1's complement.

I think requiring 2's complement in the long term would be a good idea,
even in C, as no new architecture uses any other representation and this
simplifies teaching and programming in C/C++. We could start by adding an
ISO C macro (for C compatibility) to detect 2's complement at compile
time and deprecating 1's complement and sign-magnitude representations in
C++. If no one objects, then only 2's complement could be allowed in the
next standard.
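
A minimal sketch of what such a compile-time check could look like in
C++11 today. The names SignedRepr and signed_repr are hypothetical and
not part of any standard or proposal; the check relies only on the fact
that the low bits of -1 differ between the three permitted
representations:

```cpp
// Hypothetical names, for illustration only.
enum class SignedRepr { TwosComplement, OnesComplement, SignMagnitude };

constexpr SignedRepr signed_repr()
{
    // Low two bits of -1 in each representation:
    //   two's complement : ...11  -> (-1 & 3) == 3
    //   one's complement : ...10  -> (-1 & 3) == 2
    //   sign-magnitude   : ...01  -> (-1 & 3) == 1
    return (-1 & 3) == 3 ? SignedRepr::TwosComplement
         : (-1 & 3) == 2 ? SignedRepr::OnesComplement
         :                 SignedRepr::SignMagnitude;
}

// Code that depends on 2's complement can then refuse to compile elsewhere.
static_assert(signed_repr() == SignedRepr::TwosComplement,
              "this code assumes 2's complement integers");
```

A standard macro would make the same information available to the
preprocessor and to C code.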

It would be interesting to have more guarantees on 2's complement
systems (say, no padding bits other than in bool), but I don't know if
that would be possible, as I think there are Cray machines with padding
bits in short/int/pointer types:

http://docs.cray.com/books/004-2179-001/html-004-2179-001/rvc5mrwh.html#QEARLRWH

At least it would be interesting to have a simple way to detect types 
with padding bits.
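
As a rough sketch of such a detection for integer types, one can compare
the number of value bits reported by std::numeric_limits with the size
of the object representation. The helper name has_padding_bits is
hypothetical, and this only works for types that specialize
std::numeric_limits (so not for pointers):

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: true if integer type T has bits in its object
// representation that are neither value bits nor the sign bit.
template <typename T>
constexpr bool has_padding_bits()
{
    // digits counts value bits only; add one for the sign bit if present.
    return std::numeric_limits<T>::digits
               + (std::numeric_limits<T>::is_signed ? 1 : 0)
           < static_cast<int>(sizeof(T) * CHAR_BIT);
}

// unsigned char is required to have no padding bits.
static_assert(!has_padding_bits<unsigned char>(),
              "unsigned char never has padding bits");
```

On the Cray targets mentioned above one would expect this to return true
for the affected types, but I have not verified that.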

2) CHAR_BIT > 8

Architectures with CHAR_BIT > 8 are being designed these days and they
have a very good reason to support only word-based (16/24/32 bit) types:
performance. Word-multiple memory accesses and operands simplify the
design, make it faster and leave room for bigger caches and arithmetic
units. They make it easier to fetch several instructions and operands in
parallel, and they let every transistor do what a DSP is supposed to do:
very high-speed data processing.

These DSPs have modern C++ compilers (VisualDSP++ 5.0 C/C++ Compiler 
Manual for SHARC Processors, 
http://www.analog.com/static/imported-files/software_manuals/50_21k_cc_man.rev1.1.pdf).

"Analog Devices does not support data sizes smaller than the addressable 
unit size on the processor. For the ADSP-21xxx processors, this means 
that both short and char have the same size as int. Although 32-bit 
chars are unusual, they do conform to the standard"

"All the standard features of C++ are accepted in the default mode 
except exception handling and run-time type identification because these 
impose a run-time overhead that is not desirable for all embedded 
programs. Support for these features can be enabled with the -eh and 
-rtti switches."

In DSPs that can be configured in byte-addressing mode (instead of the
default word-addressing mode), the stdint.h types are defined accordingly
(int8_t and friends only exist in byte-addressing mode). Example:
TigerSHARC DSPs (VisualDSP++ for TigerSHARC processors:
http://www.analog.com/static/imported-files/software_manuals/50_ts_cc_man.4.1.pdf).
Even pointer implementations are optimized for word addressing (taken
from the C compiler manual):

"Pointers

The pointer representation uses the low-order 30 bits to address the 
word and the high-order two bits to address the byte within the word. 
Due to the pointer implementation, the address range in byte-addressing 
mode is 0x00000000 to 0x3FFFFFFF.

The main advantage of using the high-order bits to address the bytes 
within the word as opposed to using the low-order bits is that all 
pointers that address word boundaries are compatible with existing code. 
This choice means there is no performance loss when accessing 32-bit items.

A minor disadvantage with this representation is that address arithmetic 
is slower than using low-order bits to address the bytes within a word 
when the computation might involve part-word offsets."
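
To make the quoted layout concrete, here is a small host-side model of
it. This is only a sketch based on the description above, not the actual
compiler ABI; all names (encode, word_of, byte_of, advance) are
hypothetical:

```cpp
#include <cstdint>

typedef std::uint_least32_t u32;
typedef std::uint_least64_t u64;

// Low-order 30 bits: word address. High-order 2 bits: byte within the word.
constexpr u32 kWordMask = 0x3FFFFFFFu;

constexpr u32 encode(u32 word, u32 byte)
{
    return (word & kWordMask) | ((byte & 3u) << 30);
}

constexpr u32 word_of(u32 p) { return p & kWordMask; }
constexpr u32 byte_of(u32 p) { return (p >> 30) & 3u; }

// Linear byte index of an encoded pointer (4 bytes per word).
constexpr u64 linear(u32 p)
{
    return static_cast<u64>(word_of(p)) * 4u + byte_of(p);
}

// Advancing by n bytes needs unpack, add, repack: this is the slower
// address arithmetic the manual mentions for part-word offsets.
constexpr u32 advance(u32 p, u32 n)
{
    return encode(static_cast<u32>((linear(p) + n) / 4u),
                  static_cast<u32>((linear(p) + n) % 4u));
}
```

Note that encode(w, 0) is just w, which is the compatibility property
the manual describes: a pointer to a word boundary has the same bit
pattern as a plain word address.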

I think banning or deprecating systems with CHAR_BIT != 8 would be a 
very bad idea as C++ is a natural choice for high-performance 
data/signal processors.
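
Code that handles octets can be written so that it works on both kinds
of targets. The following is a sketch under my own assumptions (the
names octet_storage and load_be32 are hypothetical): it uses int8_t only
when the implementation provides it, and masks each element to 8 bits so
that it behaves the same when char is 16 or 32 bits wide.

```cpp
#include <cstdint>

// INT8_MAX is defined only if the implementation provides int8_t,
// which is not the case on word-addressed targets.
#ifdef INT8_MAX
typedef std::int8_t octet_storage;        // exact-width type available
#else
typedef std::int_least8_t octet_storage;  // always available, may be wider
#endif

// Read a big-endian 32-bit value from four octets, one octet per char.
// The 0xFF mask discards any higher bits when CHAR_BIT > 8.
constexpr std::uint_least32_t load_be32(const unsigned char* p)
{
    return (static_cast<std::uint_least32_t>(p[0] & 0xFFu) << 24)
         | (static_cast<std::uint_least32_t>(p[1] & 0xFFu) << 16)
         | (static_cast<std::uint_least32_t>(p[2] & 0xFFu) << 8)
         |  static_cast<std::uint_least32_t>(p[3] & 0xFFu);
}
```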

Best,

Ion

