1. Abstract
This paper proposes a set of
-style type aliases for floating-point types matching specific, well-known floating-point layouts.
This is a companion paper to [P1467], which allows implementations to define floating-point types beyond the three standard types. This paper gives convenient names to some of those types.
2. Revision history
2.1. R0 -> R1 (pre-Cologne)
- Add the requirement that the types must not alias any of the standard floating-point types.
- Add a design question about feature-test macros.
- Add a section on QoI - should we strongly encourage the aliases to have a hardware implementation?
2.2. R1 -> R2 (pre-Belfast)
Changes based on feedback in Cologne from SG6, LEWGI, and EWGI. Further changes came from further development of the paper by the authors.
- Expanded the section about whether or not the fixed-layout aliases are allowed to alias standard floating-point types.
- Added a section about whether the aliases only need to guarantee layout, or should also guarantee behavior.
- Added some text, still preliminary, about literal suffixes.
3. Motivation
16-bit floating-point support is becoming more widely available in both hardware (ARM CPUs and NVIDIA GPUs) and software (OpenGL, CUDA, and LLVM IR). Programmers wanting to take advantage of 16-bit floating-point support have been stymied by the lack of built-in compiler support for the type. A common workaround is to define a class type with all of the conversion operators and overloaded arithmetic operators to make it behave as much as possible like a built-in type. But that approach is cumbersome and incomplete, requiring inline assembly or other compiler-specific magic to generate efficient code.
The problem of efficiently using newer floating-point types that haven’t traditionally been supported can’t be solved through user-defined libraries. One possible solution, an implementation changing float to be a 16-bit type, would be unpopular because users want support for newer floating-point types in addition to the standard types, and because users have come to expect float and double to be 32- and 64-bit types and have lots of existing code written with that assumption.
This problem is worth solving, and there is no viable solution under the current standard. So changing the core language in an extensible and backward-compatible way is appropriate. Providing a standard way for implementations to support 16-bit floating-point types will result in better code, more portable code, and wider use of those types.
[P1467] changes the language so that implementations can support 16-bit and other non-standard floating-point types. This paper gives well-known names to 16-bit and other commonly used floating-point types.
These two papers are the follow-up to [P0192], the short float proposal, which was not approved by EWG.
4. Header name
The type aliases proposed here do not fit neatly into any existing header. So we are offering up a strawman proposal of a new header with the name
. We are open to other names for the header and to arguments that the type aliases should be added to an existing header.
What new or existing header should the type aliases go into?
5. Type aliases
This paper introduces type aliases for several fixed-layout floating-point types. Each alias will be defined only if a type with that layout is supported by the implementation, similar to the intN_t aliases.
5.1. Supported formats
We propose aliases for the following layouts:
- [IEEE-754-2008] binary16 - IEEE 16-bit.
- [IEEE-754-2008] binary32 - IEEE 32-bit.
- [IEEE-754-2008] binary64 - IEEE 64-bit.
- [IEEE-754-2008] binary128 - IEEE 128-bit.
- bfloat16, which is binary32 with 16 bits of precision truncated; see [bfloat16].
binary32 and binary64 are the most widely used floating-point types, and are the formats that float and double have in most implementations. binary16 is becoming more widely used; see this paper’s motivation for details. binary128 has hardware support in IBM POWER P9 chips. bfloat16 is used in Google’s TPUs and in TensorFlow.
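The relationship between bfloat16 and binary32 described above can be illustrated with a minimal sketch. It assumes that float has the binary32 layout on the implementation in use, and the helper names float_to_bfloat16_bits and bfloat16_bits_to_float are hypothetical, invented for this illustration. The narrowing step truncates; a production conversion would more likely round to nearest.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float uses the IEEE binary32 layout on this implementation.
static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes float is binary32");

// Hypothetical helper: keep the sign bit, the 8 exponent bits, and the top
// 7 fraction bits of a binary32 value; drop the low 16 fraction bits.
inline std::uint16_t float_to_bfloat16_bits(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);         // copy out the object representation
    return static_cast<std::uint16_t>(bits >> 16);   // truncate, no rounding
}

// Hypothetical helper: widen by appending 16 zero fraction bits.
inline float bfloat16_bits_to_float(std::uint16_t half) {
    const std::uint32_t bits = static_cast<std::uint32_t>(half) << 16;
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Usage: values with at most 8 significant bits survive the round trip.
inline void bfloat16_round_trip_example() {
    assert(bfloat16_bits_to_float(float_to_bfloat16_bits(1.0f)) == 1.0f);
    assert(bfloat16_bits_to_float(float_to_bfloat16_bits(3.140625f)) == 3.140625f);
}
```

Because the exponent field is unchanged, the range of binary32 is kept and only precision is lost, which is why the widening direction is exact.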
The most widely used format that is not in this list is X87 80-bit. Even though there is hardware support for this format in all current x86 chips, it is used most often because it is the largest type available, not because users specifically want that format.
5.2. Aliasing standard types
Can the type aliases proposed in this paper be aliases of standard floating-point types? Or are the type aliases required to alias extended floating-point types?
This has turned out to be the most contentious issue raised in this proposal with strong opinions on both sides. In Cologne, SG6 and LEWGI voted in favor of allowing aliasing of standard types, while EWGI was strongly against the idea. The authors are in favor of prohibiting aliasing of standard types, but realize that not everyone else is convinced of that yet.
The header <cstdint> defines integer type aliases for certain integer types, such as std::int32_t and std::int64_t. These are similar in many ways to the aliases proposed here. The types in <cstdint> are allowed to alias standard integer types. That has resulted in compilation errors when users try to create an overload set with both standard types and fixed-layout aliases, such as:
int bit_count(int x) { /* ... */ }
int bit_count(std::int32_t x) { /* ... */ } // error (redefinition) where std::int32_t is an alias of int
If aliasing of standard types is allowed for the floating-point type aliases, then similar compilation errors will likely result:
int get_exponent(double x) { /* ... */ }
int get_exponent(std::float64_t x) { /* ... */ } // error (redefinition) if std::float64_t is an alias of double
This is the strongest argument against allowing aliasing of standard types. People who don’t find this argument persuasive point out that users should not create overload sets with both standard types and fixed-layout type aliases. An overload set should contain just the standard floating-point types or just the fixed-layout types, but not both. The example above that fails to compile is considered poor design and should not be encouraged.
(The arguments about overload sets apply equally to explicit template specializations.)
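For readers who want to avoid mixed overload sets entirely, one possible approach is a single constrained function template in place of per-type overloads. The sketch below is illustrative only and is not part of the proposal. It reuses the name get_exponent from the example above, and it assumes that std::is_floating_point reports true for, and std::ilogb accepts, whichever types the fixed-layout aliases end up naming.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// One definition serves every floating-point type, so it does not matter
// whether a fixed-layout alias names a standard type or a distinct one.
template <class T,
          std::enable_if_t<std::is_floating_point_v<T>, int> = 0>
int get_exponent(T x) {
    // std::ilogb returns the unbiased exponent (in radix FLT_RADIX)
    // of a finite, nonzero argument.
    return std::ilogb(x);
}

// Usage with the three standard floating-point types.
inline void get_exponent_example() {
    assert(get_exponent(8.0) == 3);    // 8 = 1.0 * 2^3
    assert(get_exponent(0.5f) == -1);  // 0.5 = 1.0 * 2^-1
    assert(get_exponent(1.0L) == 0);
}
```

The same caveat applies to explicit specializations: specializing this template for both double and an alias of double would collide in exactly the way the overloads do.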
Not allowing the aliasing of standard types imposes an implementation burden. If aliasing were allowed, then implementations that don’t define any extended floating-point types could define some of the aliases with a little bit of library code that boils down to something like:
namespace std {
  using float32_t = float;
  using float64_t = double;
}
But when aliasing is not allowed, implementations have to support extended floating-point types in at least the compiler front end, which is not a trivial task. There is also a burden on the name mangling ABI, which will have to define how to encode these extended floating-point types.
The authors feel that the burden on users of allowing aliasing of standard types is greater than the burden on implementers of not allowing such aliasing. Therefore, the authors recommend not allowing aliasing of standard types.
(This argument is predicated on the changes to overload resolution proposed in [P1467]. If those changes don’t go through, then having std::float64_t be an alias of an extended floating-point type rather than an alias of double will cause the following code to not compile:
void f(std::float32_t);
void f(std::float64_t);
void g(double x) {
  f(x); // error - ambiguous call without overload resolution changes
}
If that code doesn’t compile, that would be a bigger burden on users than not being able to overload on both double and std::float64_t. That would change the authors' opinion on the best resolution for this issue.)
5.3. Layout vs. behavior
Should the aliases for the IEEE types guarantee that the types have fully conformant IEEE behavior, or only that they have the same layout as the IEEE type?
What type aliases should be defined and how should they behave when the user compiles with the GCC option
, which turns on optimizations that result in
having non-IEEE behavior in some cases? In that situation, should
(or whatever it is called) not be defined at all? Or should it be defined and refer to an extended floating-point type that is IEEE conformant? Or should it be defined and refer to a type that, like
, is not fully IEEE conformant? Should this be a requirement on the implementation, or just encouragement?
Discussion in SG6 in Kona preferred having the type aliases guarantee full IEEE behavior.
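The distinction between layout and behavior can be made concrete with a small sketch. The names has_binary_layout, has_binary32_layout, has_binary64_layout, and claims_iec559 are hypothetical, chosen for this illustration. The first three compare a type's format parameters against those of an IEEE binary format; the last only reports what the implementation claims through std::numeric_limits. Neither kind of check observes how arithmetic is actually compiled, which is the gap this section is about.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <limits>

// Hypothetical check: does T have the size, radix, precision, and exponent
// range of a given IEEE binary interchange format?
template <class T>
constexpr bool has_binary_layout(int storage_bits, int significand_bits,
                                 int max_exponent) {
    using limits = std::numeric_limits<T>;
    return limits::is_specialized && limits::radix == 2 &&
           sizeof(T) * CHAR_BIT == static_cast<std::size_t>(storage_bits) &&
           limits::digits == significand_bits &&
           limits::max_exponent == max_exponent;
}

// binary32: 32 storage bits, 24-bit significand, max_exponent 128.
template <class T>
constexpr bool has_binary32_layout = has_binary_layout<T>(32, 24, 128);

// binary64: 64 storage bits, 53-bit significand, max_exponent 1024.
template <class T>
constexpr bool has_binary64_layout = has_binary_layout<T>(64, 53, 1024);

// What the implementation claims about conformance; a claim, not a measurement.
template <class T>
constexpr bool claims_iec559 = std::numeric_limits<T>::is_iec559;

// Usage, assuming an implementation where float and double are IEEE types.
inline void layout_check_example() {
    assert(has_binary32_layout<float>);
    assert(has_binary64_layout<double>);
    assert(!has_binary32_layout<double>);
}
```

A guarantee of layout only would correspond to the first kind of check; a guarantee of full IEEE behavior is a stronger promise that no compile-time constant of this sort can verify.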
5.4. Feature test macros
Since implementations may choose to support (or not) each of the fixed-layout aliases individually, there should be a separate test macro for detecting each of the type aliases. The names of the test macros would be derived from whichever type alias names we settle on. (The authors are not thrilled with introducing so many new test macros, but they have yet to come up with a better idea.)
How should feature test macros be handled for this feature?
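As a rough illustration of how a per-alias test macro might be used, consider the sketch below. The macro name EXAMPLE_HAS_STD_FLOAT32_T and the alias real32 are hypothetical, invented for this example; the actual macro names would be derived from whichever alias names are chosen. The name std::float32_t is likewise only one of the candidate spellings listed in this paper.

```cpp
#include <cassert>
#include <type_traits>

// EXAMPLE_HAS_STD_FLOAT32_T is a hypothetical feature-test macro standing in
// for whatever name is eventually chosen for the binary32 alias.
#if defined(EXAMPLE_HAS_STD_FLOAT32_T)
  // The header that would declare the alias is still an open question,
  // so no #include is shown for it here.
  using real32 = std::float32_t;
#else
  // Fallback when the implementation does not provide the alias.
  using real32 = float;
#endif

// Code written against real32 compiles either way.
inline real32 halve(real32 x) { return x / 2; }

// Usage: with the hypothetical macro undefined, real32 is float.
inline void feature_macro_example() {
    assert(halve(real32{3}) == real32{1.5f});
}
```

With one macro per alias, a program needs a block like this for each type it wants to use conditionally, which is the cost the authors note above.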
5.5. Names
We are proposing several different naming schemes for the fixed-layout type aliases, and are open to other suggested naming schemes. In committee discussions so far, no set of names has emerged as the favorite. We are leaving it up to the committee to choose.
5.5.1. floatX_t
- std::float16_t
- std::float32_t
- std::float64_t
- std::float128_t
- std::bfloat16_t
This is the simplest of all the options being presented. It is the naming scheme used by Boost.Math’s fixed-layout floating-point types.
Nothing in the names of the IEEE aliases implies that they are in fact IEEE binary formats. Additionally, std::float16_t and std::bfloat16_t are similar enough that we aren’t fully comfortable using these names.
5.5.2. iec559_binaryX_t
- std::iec559_binary16_t
- std::iec559_binary32_t
- std::iec559_binary64_t
- std::iec559_binary128_t
- std::bfloat16_t
This naming scheme is as explicit as possible about the layouts that it guarantees. "IEC559" is how the rest of the standard refers to [IEEE-754-2008].
These names are long.
will cause some confusion because it is not as recognizable as
among non-floating-point experts.
5.5.3. binaryX_t
- std::binary16_t
- std::binary32_t
- std::binary64_t
- std::binary128_t
- std::bfloat16_t
These names are shorter than
, but they are less obvious; nothing in their names directly indicates that they are floating-point types.
This is the only option that has received strong SG6 discouragement.
5.5.4. fp::binaryX_t
- std::fp::binary16_t
- std::fp::binary32_t
- std::fp::binary64_t
- std::fp::binary128_t
- std::fp::bfloat16_t
The namespace
makes it more obvious that these types are floating-point types, assisting in the recognition of
as an [IEEE-754-2008] format. A using namespace directive can be used to avoid repeating
everywhere.
The drawbacks of this approach are that it introduces a new namespace with a very small purpose, and that
is somewhat redundant with two different floating-point indications (
and the
in
).
5.5.5. fp_binaryX_t
- std::fp_binary16_t
- std::fp_binary32_t
- std::fp_binary64_t
- std::fp_binary128_t
- std::fp_bfloat16_t
This is a slight modification of the previous scheme, which trades the nested namespace for an fp_ prefix. The advantages and disadvantages are similar.
5.5.6. iec559::binaryX_t
- std::iec559::binary16_t
- std::iec559::binary32_t
- std::iec559::binary64_t
- std::iec559::binary128_t
- std::bfloat16_t ?? (unsure how to fit bfloat16 into this scheme)
This scheme was proposed during SG6 discussions in Kona. It is similar to the namespace fp scheme, but with a more precise namespace name.
6. Literal suffixes
Once the names of the aliases have been decided on, a literal suffix for each of those types will be defined, similar to what is proposed in [P1280]. Each type will have either two literal operators with long double and unsigned long long parameters, or (for types whose conversion rank is not less than long double) one literal operator with a const char* parameter. The literal operators for an implementation might look like this (with all names subject to change):
namespace std {
inline namespace literals {
inline namespace fixed_float_literals {
  constexpr float16_t operator "" fp16(long double);
  constexpr float16_t operator "" fp16(unsigned long long);
  constexpr float32_t operator "" fp32(long double);
  constexpr float32_t operator "" fp32(unsigned long long);
  constexpr float64_t operator "" fp64(long double);
  constexpr float64_t operator "" fp64(unsigned long long);
  constexpr float128_t operator "" fp128(const char*);
  constexpr bfloat16_t operator "" bf16(long double);
  constexpr bfloat16_t operator "" bf16(unsigned long long);
}
}
}
constexpr float16_t operator "" fp16(long double d);
constexpr float16_t operator "" fp16(unsigned long long d);
Returns: float16_t{d}.
constexpr float32_t operator "" fp32(long double d);
constexpr float32_t operator "" fp32(unsigned long long d);
Returns: float32_t{d}.
constexpr float64_t operator "" fp64(long double d);
constexpr float64_t operator "" fp64(unsigned long long d);
Returns: float64_t{d}.
constexpr float128_t operator "" fp128(const char* s);
Effects: Equivalent to:
float128_t x{0}; from_chars(s, s + strlen(s), x); return x;
constexpr bfloat16_t operator "" bf16(long double d);
constexpr bfloat16_t operator "" bf16(unsigned long long d);
Returns: bfloat16_t{d}.
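The two forms of literal operator described in this section can be tried out today with user-defined suffixes. The following is a minimal sketch, not the proposed library: the namespace example_literals and the suffixes _f32 and _fmax are hypothetical (user code must use suffixes that begin with an underscore), float and long double stand in for the fixed-layout types, and std::strtold stands in for from_chars.

```cpp
#include <cassert>
#include <cstdlib>

namespace example_literals {

// Cooked forms: the compiler converts the literal to long double or
// unsigned long long first, and the operator then narrows the result.
constexpr float operator""_f32(long double d) {
    return static_cast<float>(d);
}
constexpr float operator""_f32(unsigned long long d) {
    return static_cast<float>(d);
}

// Raw form: the operator receives the literal's spelling as a string and
// parses it directly, so the value never passes through another type.
// Note: this simple parser does not handle digit separators.
inline long double operator""_fmax(const char* s) {
    return std::strtold(s, nullptr);
}

}  // namespace example_literals

// Usage of both forms.
inline void literal_example() {
    using namespace example_literals;
    assert(1.5_f32 == 1.5f);     // cooked, long double overload
    assert(2_f32 == 2.0f);       // cooked, unsigned long long overload
    assert(0.25_fmax == 0.25L);  // raw, parsed from the characters "0.25"
}
```

The raw form matters for a type whose conversion rank is not less than that of long double: a cooked long double parameter could lose precision before the operator ever sees the value.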