P2933R4: Extend <bit> header function with overloads for std::simd

1. Motivation

[P1928R7] introduced data parallel types to C++. It mostly provided operators which worked on or with std::simd types, but it also included overloads of useful functions from other parts of C++ (e.g., sin, cos, abs). In this paper we propose some other functions from standard C++ headers which should receive overloads to work with std::simd types. The list isn’t exhaustive, but reflects those functions which are desirable to include.

2. Support for `<bit>`

The <bit> header is part of the numerics library and provides utilities for manipulating and querying the properties of integral values when treated as collections of bits. The table below summarises the contents of <bit>.

Name	Purpose	Proposed (Y/N)
`endian`	A type which indicates the endianness of scalar types.	N
`bit_cast`	reinterpret the object representation of one type as that of another	N
`byteswap`	reverses the bytes in the given integer value	Y
`has_single_bit`	checks if a number is an integral power of two	Y
`bit_ceil`	finds the smallest integral power of two not less than the given value	Y
`bit_floor`	finds the largest integral power of two not greater than the given value	Y
`bit_width`	finds the smallest number of bits needed to represent the given value	Y
`rotl`	computes the result of bitwise left-rotation	Y
`rotr`	computes the result of bitwise right-rotation	Y
`countl_zero`	counts the number of consecutive 0 bits, starting from the most significant bit	Y
`countl_one`	counts the number of consecutive 1 bits, starting from the most significant bit	Y
`countr_zero`	counts the number of consecutive 0 bits, starting from the least significant bit	Y
`countr_one`	counts the number of consecutive 1 bits, starting from the least significant bit	Y
`popcount`	counts the number of 1 bits in an unsigned integer	Y

Of these types and functions, only the first two shouldn’t be handled by std::simd:

endian indicates the endianness of scalar types. A SIMD value whose elements are of a given scalar type has the same endianness as that type, so no special handling is needed in std::simd.
bit_cast should be handled differently for std::simd values, and a separate proposal for simd_bit_cast will be provided.

All the other functions from <bit> should be handled in std::simd by element-wise application of the function to each element of the SIMD value. Any constraints and behaviours of the scalar function carry over to the elements of the SIMD value. For instance, scalar byteswap participates in overload resolution only if its argument type models std::integral, so the overload of byteswap taking a std::simd parameter places the same constraint on std::simd<T, N>::value_type.

One small departure from the scalar behaviour of <bit> arises where a function's return type differs from its argument type. For example, the standard <bit> header defines some query functions as returning integer values:

template< class T >
constexpr int bit_width( T x ) noexcept;

template< class T >
constexpr int countl_one( T x ) noexcept;

If an int were to be returned from the std::simd overload of such functions then the size of the elements could change. For example, computing the bit width of an 8-bit integer could generate a std::simd of 64-bit integers as the output, which would lead to a dramatic change in storage size and performance. Instead, we propose that all the overloads for <bit> should return element types which are the same physical size as the element types they are querying. This means that calling bit_width on a std::simd of unsigned 8-bit integers returns a std::simd containing signed 8-bit values.

When calling the rotate functions rotl and rotr, it is common to want to rotate all simd elements by the same amount. An overload is therefore provided which takes a scalar int value, matching rotl and rotr in the <bit> header. In this case there is no need to supply an integer of the same width as the first parameter’s elements (as is required for the simd variant), since broadcasting an integer to a simd has negligible performance impact.

3. Wording

Below, substitute the � character with a number the editor finds appropriate for the table, paragraph, section or sub-section.

3.1. Modify [version.syn]

In [version.syn] bump the __cpp_lib_simd version.

3.2. Modify [simd.expos]

Note: simd-type is also added by P2663R7 and it is exactly the same.

template<class V>
  concept simd-type = // exposition only
     same_as<V, basic_simd<typename V::value_type, typename V::abi_type>> &&
     is_default_constructible_v<V>;

template<class V>
  concept simd-floating-point = // exposition only
     same_as<V, basic_simd<typename V::value_type, typename V::abi_type>> &&
     is_default_constructible_v<V> && floating_point<typename V::value_type>;

3.3. Update the synopsis

In the header <simd> synopsis ([simd.syn]), add the following at the end, after "Mathematical functions":

// [simd.bit], Bit manipulation 
template<simd-type V> constexpr V byteswap(const V& v) noexcept;
template<simd-type V> constexpr V bit_ceil(const V& v) noexcept;
template<simd-type V> constexpr V bit_floor(const V& v) noexcept;

template<simd-type V>
  constexpr typename V::mask_type has_single_bit(const V& v) noexcept;

template<simd-type V0, simd-type V1>
  constexpr V0 rotl(const V0& v, const V1& s) noexcept;
template<simd-type V>
  constexpr V  rotl(const V& v, int s) noexcept;

template<simd-type V0, simd-type V1>
  constexpr V0 rotr(const V0& v, const V1& s) noexcept;
template<simd-type V>
  constexpr V  rotr(const V& v, int s) noexcept;

template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  bit_width(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countl_zero(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countl_one(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countr_zero(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countr_one(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  popcount(const V& v) noexcept;

3.4. Add new section [simd.bit] after [simd.math]

� basic_simd bit library [simd.bit]
template<simd-type V> constexpr V byteswap(const V& v) noexcept;
Constraints:

The type V::value_type models integral.

Returns:

A basic_simd object where the i^th element is initialized to the result of std::byteswap(v[i]) for all i in the range [0, V::size()).
template<simd-type V> constexpr V bit_ceil(const V& v) noexcept;
Constraints:

The type V::value_type is an unsigned integer type ([basic.fundamental]).

Preconditions:

For every i in the range [0, V::size()), the smallest power of 2 greater than or equal to v[i] is representable as a value of type V::value_type.

Returns:

A basic_simd object where the i^th element is initialized to the result of std::bit_ceil(v[i]) for all i in the range [0, V::size()).

Remarks: A function call expression that violates the precondition in the Preconditions: element is not a core constant expression ([expr.const]).
template<simd-type V> constexpr V bit_floor(const V& v) noexcept;
Constraints:

The type V::value_type is an unsigned integer type ([basic.fundamental]).

Returns:

A basic_simd object where the i^th element is initialized to the result of std::bit_floor(v[i]) for all i in the range [0, V::size()).
template<simd-type V> constexpr typename V::mask_type has_single_bit(const V& v) noexcept;
Constraints:

The type V::value_type is an unsigned integer type ([basic.fundamental]).

Returns:

A basic_simd_mask object where the i^th element is initialized to the result of std::has_single_bit(v[i]) for all i in the range [0, V::size()).
template<simd-type V0, simd-type V1> constexpr V0 rotl(const V0& v0, const V1& v1) noexcept;
template<simd-type V0, simd-type V1> constexpr V0 rotr(const V0& v0, const V1& v1) noexcept;
Constraints:

The type V0::value_type is an unsigned integer type ([basic.fundamental]),

the type V1::value_type models integral,

V0::size() == V1::size() is true, and

sizeof(typename V0::value_type) == sizeof(typename V1::value_type) is true.

Returns:

A basic_simd object where the i^th element is initialized to the result of bit-func(v0[i], static_cast<int>(v1[i])) for all i in the range [0, V0::size()), where bit-func is the corresponding scalar function from <bit>.
template<simd-type V> constexpr V rotl(const V& v, int s) noexcept;
template<simd-type V> constexpr V rotr(const V& v, int s) noexcept;
Constraints:

The type V::value_type is an unsigned integer type ([basic.fundamental]).

Returns:

A basic_simd object where the i^th element is initialized to the result of bit-func(v[i], s) for all i in the range [0, V::size()), where bit-func is the corresponding scalar function from <bit>.
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  bit_width(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countl_zero(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countl_one(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countr_zero(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  countr_one(const V& v) noexcept;
template<simd-type V>
  constexpr rebind_simd_t<make_signed_t<typename V::value_type>, V>
  popcount(const V& v) noexcept;
Constraints:

The type V::value_type is an unsigned integer type ([basic.fundamental]).

Returns:

A basic_simd object where the i^th element is initialized to the result of bit-func(v[i]) for all i in the range [0, V::size()), where bit-func is the corresponding scalar function from <bit>.

4. Revision History

R3 => R4

Minor typo fixes.
Strike some unnecessary notes.
LEWG confirmed that overloads for rotl/rotr should be provided which take a plain int as their second parameter.
Modified constraints to be more simply worded.
Modified Returns text to match [P1928R15].

R2 => R3

Add missing synopsis wording addition.
Change detailed behaviour descriptions to match the style/wording of recent P1928 updates.
Added concepts to detect simd types whose elements are integral, unsigned, and so on.
Added overload to rotl/rotr to allow the second parameter to be a scalar.

R1 => R2

Added feature test macro

R0 => R1

Fix typo: std::make_signed to std::make_signed_t
Make several paper text improvements

Published Proposal, 2025-02-13
