1. Motivation
There are two types of casting in std::simd, type-casting and bit-casting,
which would benefit from alternative forms that make them simpler to use and
more readable, and that provide some extra utility which is otherwise
unavailable.
1.1. Type casting
Type casting in std::simd occurs when the programmer wants to change the type
(and therefore value) of the elements of a simd without changing the
number of elements. The constructors in std::simd allow this to be expressed
by a direct call to a constructor, or through a static_cast, but the type
definition of the destination must be set up in advance (e.g., using
rebind_simd_t). This can lead to verbose code, which in turn makes the code
difficult to read and understand quickly. We suggest that a
short-hand called simd_cast is provided to make it easy to create the new type
and perform the element-by-element cast. The following code shows two ways to write a function using existing facilities:
template <typename T, typename ABI>
auto incrementAsFloat1(const basic_simd<T, ABI>& x) {
  // Use constructor.
  return rebind_simd_t<float, basic_simd<T, ABI>>(x) + 1.0f;
}

template <typename T, typename ABI>
auto incrementAsFloat(const basic_simd<T, ABI>& x) {
  // Use static_cast
  using OUT = simd<float, basic_simd<T, ABI>::size>;
  return static_cast<OUT>(x) + 1.0f;
}
Note that there are other ways to write these, perhaps using extra aliases or intermediate temporary variables to improve readability. However, it is much cleaner to be able to write them like this instead:
template <typename T, typename ABI>
auto incrementAsFloat(const basic_simd<T, ABI>& x) {
  return simd_cast<float>(x) + 1.0f;
}
The simd_cast variant is much more readable and it captures the intent of the
programmer very concisely.
Of course there are times when the type-conversion may also require a change in
ABI (e.g., to create a new element type with an alternative target type), in
which case simd_cast wouldn’t suffice, but for the majority of the code we have
encountered the simpler simd_cast captures the programmer’s requirements.
Another good reason to use simd_cast is that it enables simd-generic code to
be written. It is desirable to be able to write a generic algorithm once,
initially using scalar types to get the code working, and then later
substituting a simd type. In such code we want a uniform way to cast from
the implementation value type to a new type without having to reflect upon
whether the type is simd or scalar. A variant of simd_cast can be provided
which works on scalar types or simd types. For example, the following code could
work with either in that case:
// simd-generic function
auto incrementAsFloat(auto x) {
  return simd_cast<float>(x) + 1.0f;
}

// Call with scalar:
auto w1 = incrementAsFloat(23.f);

// Call with simd:
auto w2 = incrementAsFloat(simd<int>(ptr));
Finally, it is worth noting that while it would be straightforward for
programmers to define simd_cast or an equivalent themselves, it is such a
common utility that it is better for it to be defined just once in
std::simd itself.
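To illustrate how little is needed for a user-defined equivalent, the following is a minimal sketch of the scalar and vector overloads working together in simd-generic code. It assumes a hypothetical stand-in type, toy_simd, so that it compiles without a std::simd implementation; toy_simd and its operator+ are illustrative only and are not part of any proposal.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a simd type: N lanes of type T.
// It is NOT std::simd; it only makes this sketch self-contained.
template <typename T, std::size_t N>
struct toy_simd {
  std::array<T, N> lanes{};
};

// Scalar overload: behaves like a plain static_cast.
template <typename To, typename From>
constexpr To simd_cast(const From& x) {
  return static_cast<To>(x);
}

// Vector overload: same number of lanes, each lane converted individually.
// Partial ordering selects this overload for toy_simd arguments.
template <typename To, typename From, std::size_t N>
constexpr toy_simd<To, N> simd_cast(const toy_simd<From, N>& x) {
  toy_simd<To, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    out.lanes[i] = static_cast<To>(x.lanes[i]);
  }
  return out;
}

// Lane-wise addition of a scalar, just enough for the function below.
template <typename T, std::size_t N>
constexpr toy_simd<T, N> operator+(toy_simd<T, N> v, T s) {
  for (auto& lane : v.lanes) {
    lane += s;
  }
  return v;
}

// simd-generic function: one body serves scalars and toy_simd values.
template <typename V>
constexpr auto incrementAsFloat(const V& x) {
  return simd_cast<float>(x) + 1.0f;
}
```

Calling incrementAsFloat with an int scalar yields a float, while calling it with a toy_simd<int, 3> yields a toy_simd<float, 3>; the caller never has to name the destination type.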
1.2. Bit-casting
The second type of casting operation is bit-casting, where the underlying bit
pattern is interpreted as though it were a different simd value. In such
a conversion not only can the element type change, but the number of elements
could also change. For example, a simd with multi-byte elements could be
bit-cast into a simd with byte-sized elements to access the individual bytes
of the original.
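To make the change in element count concrete, here is a minimal sketch that uses plain std::array as a stand-in for a simd value (an assumption made so the example does not depend on a std::simd implementation). The helper name lanes_as_bytes is hypothetical, and std::memcpy is used in place of std::bit_cast so that the sketch does not require C++20.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret four 32-bit lanes as sixteen 8-bit lanes.
// std::array stands in for a simd value; this is not the std::simd API.
inline std::array<std::uint8_t, 16> lanes_as_bytes(
    const std::array<std::uint32_t, 4>& v) {
  static_assert(sizeof(std::uint32_t) * 4 == sizeof(std::uint8_t) * 16,
                "both views must cover the same number of bytes");
  std::array<std::uint8_t, 16> bytes{};
  // Copy the object representation. The order of the bytes within each
  // 32-bit lane depends on the endianness of the target.
  std::memcpy(bytes.data(), v.data(), bytes.size());
  return bytes;
}
```

The total size in bytes is unchanged, so the element count grows by the ratio of the two element sizes (here, from 4 to 16).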
The existing std::bit_cast function already allows a value of a simd type to
be bit-cast into a different simd type:
// Do something to a complex simd value ([[R2663R5]]).
template <typename T, typename ABI>
auto fn(const basic_simd<std::complex<T>, ABI>& x) {
  // Set up a type to represent the raw floating point elements in the
  // complex simd type. This doubles the number of elements in the simd.
  constexpr int numCmplxElements = basic_simd<std::complex<T>, ABI>::size;
  using AsT = simd<T, numCmplxElements * 2>;

  // Do the bit-cast conversion to obtain the raw floating point elements.
  auto asT = std::bit_cast<AsT>(x);

  auto result = ...; // e.g., call an Intel intrinsic like _mm512_fmsubadd_ps

  // Convert back to its original complex form.
  return std::bit_cast<basic_simd<std::complex<T>, ABI>>(result);
}
This example shows the verbose mechanics of making that conversion. The correct
type needs to be created by calculating the appropriate number of new elements
for the bit-cast element type, and then creating a suitable type using those
elements, before std::bit_cast can be called. It would be more convenient to
be able to specify the new element type as part of a new simd_bit_cast
function:
template <typename T, typename ABI>
auto fn(const basic_simd<std::complex<T>, ABI>& x) {
  auto asT = simd_bit_cast<T>(x);

  auto result = ...; // e.g., call an Intel intrinsic like _mm512_fmsubadd_ps

  return simd_bit_cast<std::complex<T>>(result);
}
The simd_bit_cast function can also be overloaded for scalar types to allow
simd-generic programming. It will be equivalent to a std::bit_cast in those
circumstances.
Like simd_cast, simd_bit_cast is easily defined by the programmer if
they want it, but it is such a useful function that it
should be defined once in std::simd for all programmers to use.
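As a rough illustration of such a user-side definition, the sketch below pairs a scalar overload with a vector overload that derives the new element count from the element sizes. It is written against a hypothetical stand-in type, toy_simd, rather than std::simd, and it uses std::memcpy instead of std::bit_cast so that it also builds as C++17; both choices are assumptions made for this sketch only.

```cpp
#include <array>
#include <complex>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a simd type: N lanes of type T (not std::simd).
template <typename T, std::size_t N>
struct toy_simd {
  std::array<T, N> lanes{};
};

// Scalar overload: same-size reinterpretation of a single value.
template <typename To, typename From>
To simd_bit_cast(const From& x) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable_v<To> &&
                    std::is_trivially_copyable_v<From>,
                "both types must be trivially copyable");
  To out;
  std::memcpy(&out, &x, sizeof(To));
  return out;
}

// Vector overload: the caller names only the new element type; the new
// lane count follows from keeping the total size in bytes unchanged.
template <typename To, typename From, std::size_t N>
toy_simd<To, (N * sizeof(From)) / sizeof(To)> simd_bit_cast(
    const toy_simd<From, N>& x) {
  constexpr std::size_t M = (N * sizeof(From)) / sizeof(To);
  static_assert(M * sizeof(To) == N * sizeof(From),
                "total size in bytes must be preserved");
  static_assert(std::is_trivially_copyable_v<To> &&
                    std::is_trivially_copyable_v<From>,
                "both element types must be trivially copyable");
  toy_simd<To, M> out{};
  std::memcpy(out.lanes.data(), x.lanes.data(), N * sizeof(From));
  return out;
}

// Usage: view two complex<float> lanes as four float lanes (re, im, re, im).
inline toy_simd<float, 4> interleaved(
    const toy_simd<std::complex<float>, 2>& c) {
  return simd_bit_cast<float>(c);
}
```

The size calculation in the return type mirrors the one the programmer would otherwise have to write by hand before calling std::bit_cast.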
Note that no functions will be provided to bit-cast simd_mask types because
the underlying implementation of a mask can vary by target.
2. Implementation experience
In Intel’s implementation of std::simd the simd_cast function was added
very early on because it is so widely used.
The implementation of Intel’s std::simd itself uses simd_bit_cast to make it
easier to interface to compiler intrinsics. Intrinsics often require particular
data types to be used to achieve certain effects, and the bit-cast allows the
underlying bits to be quickly and easily reinterpreted.
Intel uses std::simd in a number of internal software projects, and some of
those (particularly wireless or packet-processing) need to be able to
easily reinterpret the underlying bits in different ways.
3. Wording
3.1. Add [simd.casts] to the synopsis
Add the following to the [simd.syn] section:
// [simd.casts], basic_simd cast functions
template <typename To, typename From>
  constexpr To simd_cast(const From& x);
template <typename To, typename From, typename Abi>
  constexpr rebind_simd_t<To, basic_simd<From, Abi>>
    simd_cast(const basic_simd<From, Abi>& x);
template <typename To, typename From>
  constexpr To simd_bit_cast(const From& x);
template <typename To, typename From, typename Abi>
  constexpr simd<To, (basic_simd<From, Abi>::size * sizeof(From)) / sizeof(To)>
    simd_bit_cast(const basic_simd<From, Abi>& x);
3.2. Add new simd cast section [simd.casts]
basic_simd casts [simd.casts]

1) template <typename To, typename From>
     constexpr To simd_cast(const From& x);
2) template <typename To, typename From, typename Abi>
     constexpr rebind_simd_t<To, basic_simd<From, Abi>>
       simd_cast(const basic_simd<From, Abi>& x);

Returns: For the first overload, equivalent to returning
static_cast<To>(x). For the second overload, equivalent to returning
static_cast<rebind_simd_t<To, basic_simd<From, Abi>>>(x).

1) template <typename To, typename From>
     constexpr To simd_bit_cast(const From& x);
2) template <typename To, typename From, typename Abi>
     constexpr simd<To, (basic_simd<From, Abi>::size * sizeof(From)) / sizeof(To)>
       simd_bit_cast(const basic_simd<From, Abi>& x);

Returns: For the first overload, equivalent to returning
std::bit_cast<To>(x). For the second overload, equivalent to returning
std::bit_cast<simd<To, (basic_simd<From, Abi>::size * sizeof(From)) / sizeof(To)>>(x).