1. Motivation
[P1928R7] introduced data parallel types to C++. It mostly provided operators
which work on or with `std::simd` types, but it also included overloads of
useful functions from other parts of C++ (e.g., `sin`, `cos`, `abs`). In this
paper we propose some other functions from standard C++ headers which should
receive overloads to work with `std::simd` types. The list isn't exhaustive,
but reflects those functions which are desirable to include.
2. Support for `<bit>`
The `<bit>` header is part of the numerics library and provides utilities for
manipulating and querying the properties of integral values when treated as
collections of bits. The table below summarises the contents of `<bit>`.
Name | Purpose | Proposed (Y/N) |
---|---|---|
`endian` | A type which indicates the endianness of scalar types. | N |
`bit_cast` | reinterprets the object representation of one type as that of another | N |
`byteswap` | reverses the bytes in the given integer value | Y |
`has_single_bit` | checks if a number is an integral power of two | Y |
`bit_ceil` | finds the smallest integral power of two not less than the given value | Y |
`bit_floor` | finds the largest integral power of two not greater than the given value | Y |
`bit_width` | finds the smallest number of bits needed to represent the given value | Y |
`rotl` | computes the result of bitwise left-rotation | Y |
`rotr` | computes the result of bitwise right-rotation | Y |
`countl_zero` | counts the number of consecutive 0 bits, starting from the most significant bit | Y |
`countl_one` | counts the number of consecutive 1 bits, starting from the most significant bit | Y |
`countr_zero` | counts the number of consecutive 0 bits, starting from the least significant bit | Y |
`countr_one` | counts the number of consecutive 1 bits, starting from the least significant bit | Y |
`popcount` | counts the number of 1 bits in an unsigned integer | Y |
Of these types and functions, only the first two shouldn't be handled by
`std::simd`:
- `endian` indicates the endianness of a scalar type. A SIMD value with elements of the underlying scalar type has the same endianness, so no special handling is needed in `std::simd`.
- `bit_cast` should be handled differently for `std::simd` values, and a separate proposal for `simd_bit_cast` will be provided.
All the other functions from `<bit>` should be handled in `std::simd` by
element-wise application of the function to each element of the SIMD value. Any
constraints and behaviours on the function will be applied at the SIMD value
level. For instance, if a scalar function participates in overload resolution
only if its argument type satisfies a given concept, then the overload of that
function with a `std::simd` parameter has the same constraint for the element
type of that `std::simd`.
One small modification to the behaviour of these functions for `std::simd` is
where the return type differs from the input type. For example, the standard
`<bit>` header defines some query functions as returning integer values:
template<class T> constexpr int bit_width(T x) noexcept;
template<class T> constexpr int countl_one(T x) noexcept;
If an `int` were to be returned from the `std::simd` overload of such functions
then the size of the elements could change. For example, computing the bit
width of an 8-bit integer could generate a `std::simd` of 64-bit integers as
the output, which would lead to a dramatic change in storage size and
performance. Instead, we propose that all the overloads for `std::simd` should
return element types which are the same physical size as the element types they
are querying. This would mean that calling `bit_width` on a `std::simd` of
unsigned 8-bit integers will return a `std::simd` containing signed 8-bit
values.
When calling the rotate functions `rotl` and `rotr` it is common to want to
rotate all SIMD elements by the same amount. An overload will be provided which
takes a scalar `int` value to match `std::rotl` and `std::rotr` in the `<bit>`
header. In this case there is no need to supply an integer of the same width as
the first parameter's elements (as described above for the `std::simd` variant)
since broadcasting an `int` to a `std::simd` has negligible performance impact.
3. Wording
Below, substitute the � character with a number the editor finds appropriate for the table, paragraph, section or sub-section.
3.1. Modify [version.syn]
In [version.syn] bump the version of the corresponding feature-test macro.
3.2. Modify [simd.expos]
Note: the addition below is also made by P2663R7, where it is exactly the same.
template<class V>
  concept simd-type = // exposition only
    same_as<V, basic_simd<typename V::value_type, typename V::abi_type>> &&
    is_default_constructible_v<V>;
template<class V>
  concept simd-floating-point = // exposition only
    same_as<V, basic_simd<typename V::value_type, typename V::abi_type>> &&
    is_default_constructible_v<V> && floating_point<typename V::value_type>;
3.3. Update the synopsis
In the header `<simd>` synopsis ([simd.syn]), add at the end, after the "Mathematical functions":
// [simd.bit], Bit manipulation
template<simd-type V> constexpr V byteswap(const V& v) noexcept;
template<simd-type V> constexpr V bit_ceil(const V& v) noexcept;
template<simd-type V> constexpr V bit_floor(const V& v) noexcept;
template<simd-type V>
  constexpr typename V::mask_type has_single_bit(const V& v) noexcept;
template<simd-type V0, simd-type V1>
  constexpr V0 rotl(const V0& v, const V1& s) noexcept;
template<simd-type V>
  constexpr V rotl(const V& v, int s) noexcept;
template<simd-type V0, simd-type V1>
  constexpr V0 rotr(const V0& v, const V1& s) noexcept;
template<simd-type V>
  constexpr V rotr(const V& v, int s) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
    bit_width(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
    countl_zero(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
    countl_one(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
    countr_zero(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
    countr_one(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
    popcount(const V& v) noexcept;
3.4. Add new section [simd.bit] after [simd.math]
� `basic_simd` bit library [simd.bit]

template<simd-type V> constexpr V byteswap(const V& v) noexcept;

Constraints: The type `V::value_type` models `integral`.

Returns: A `basic_simd` object where the `i`th element is initialized to the result of `std::byteswap(v[i])` for all `i` in the range `[0, V::size())`.

template<simd-type V> constexpr V bit_ceil(const V& v) noexcept;

Constraints: The type `V::value_type` is an unsigned integer type ([basic.fundamental]).

Preconditions: For every `i` in the range `[0, V::size())`, the smallest power of 2 greater than or equal to `v[i]` is representable as a value of type `V::value_type`.

Returns: A `basic_simd` object where the `i`th element is initialized to the result of `std::bit_ceil(v[i])` for all `i` in the range `[0, V::size())`.

Remarks: A function call expression that violates the precondition in the Preconditions: element is not a core constant expression ([expr.const]).

template<simd-type V> constexpr V bit_floor(const V& v) noexcept;

Constraints: The type `V::value_type` is an unsigned integer type ([basic.fundamental]).

Returns: A `basic_simd` object where the `i`th element is initialized to the result of `std::bit_floor(v[i])` for all `i` in the range `[0, V::size())`.

template<simd-type V> constexpr typename V::mask_type has_single_bit(const V& v) noexcept;

Constraints: The type `V::value_type` is an unsigned integer type ([basic.fundamental]).

Returns: A `basic_simd_mask` object where the `i`th element is initialized to the result of `std::has_single_bit(v[i])` for all `i` in the range `[0, V::size())`.

template<simd-type V0, simd-type V1> constexpr V0 rotl(const V0& v0, const V1& v1) noexcept;
template<simd-type V0, simd-type V1> constexpr V0 rotr(const V0& v0, const V1& v1) noexcept;

Constraints: The type `V0::value_type` is an unsigned integer type ([basic.fundamental]), the type `V1::value_type` models `integral`, `V0::size() == V1::size()` is `true`, and `sizeof(typename V0::value_type) == sizeof(typename V1::value_type)` is `true`.

Returns: A `basic_simd` object where the `i`th element is initialized to the result of `bit-func(v0[i], static_cast<int>(v1[i]))` for all `i` in the range `[0, V0::size())`, where bit-func is the corresponding scalar function from `<bit>`.

template<simd-type V> constexpr V rotl(const V& v, int s) noexcept;
template<simd-type V> constexpr V rotr(const V& v, int s) noexcept;

Constraints: The type `V::value_type` is an unsigned integer type ([basic.fundamental]).

Returns: A `basic_simd` object where the `i`th element is initialized to the result of `bit-func(v[i], s)` for all `i` in the range `[0, V::size())`, where bit-func is the corresponding scalar function from `<bit>`.

template<simd-type V> constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V> bit_width(const V& v) noexcept;
template<simd-type V> constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V> countl_zero(const V& v) noexcept;
template<simd-type V> constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V> countl_one(const V& v) noexcept;
template<simd-type V> constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V> countr_zero(const V& v) noexcept;
template<simd-type V> constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V> countr_one(const V& v) noexcept;
template<simd-type V> constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V> popcount(const V& v) noexcept;

Constraints: The type `V::value_type` is an unsigned integer type ([basic.fundamental]).

Returns: A `basic_simd` object where the `i`th element is initialized to the result of `bit-func(v[i])` for all `i` in the range `[0, V::size())`, where bit-func is the corresponding scalar function from `<bit>`.
4. Revision History
R3 => R4
- Minor typo fixes.
- Strike some unnecessary notes.
- LEWG confirmed that overloads for `rotl`/`rotr` should be provided which take a plain `int` as their second parameter.
- Modified constraints to be more simply worded.
- Modified returns text to match [P1928R15].
R2 => R3
- Add missing synopsis wording addition.
- Change detailed behaviour descriptions to match the style/wording of recent P1928 updates.
- Added concepts to detect simd types whose elements are integral, unsigned, and so on.
- Added overload to rotl/rotr to allow the second parameter to be a scalar.
R1 => R2
- Added feature test macro.
R0 => R1
- Fix typo: `std::make_signed` to `std::make_signed_t`.
- Make several paper text improvements.