Doc No: SC22/WG21/N1965
        J16/06-0035
Date: 2006-02-25
Project: JTC1.22.32
Reply to: Robert Klarer
          IBM Canada, Ltd.
          klarer@ca.ibm.com
[Editor's note: this document is being proposed (by me) as a basis for continuing work. None of the design choices implied in this work is final. Better ideas, as well as comments and critiques will be gratefully received.]
This document is known to be incomplet, inkorrect, and badly formatted.
Most of today's general-purpose computing architectures provide binary floating-point arithmetic in hardware. Binary floating-point is an efficient representation that minimizes memory use, and is simpler to implement than floating-point arithmetic using other bases. It has therefore become the norm for scientific computations, with almost all implementations following the IEEE-754 standard for binary floating-point arithmetic.
However, human computation and communication of numeric values almost always use decimal arithmetic and decimal notation. Laboratory notes, scientific papers, legal documents, business reports and financial statements all record numeric values in decimal form. When numeric data are given to a program or displayed to a user, conversion between binary and decimal is required. Such conversions involve inherent rounding errors, because decimal fractions cannot, in general, be represented exactly by binary floating-point values. These errors often cause usability and efficiency problems, depending on the application.
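The conversion error described above can be observed directly. The following is a minimal sketch, assuming that double is an IEEE-754 binary64 type; the function names are illustrative only and are not part of this report. Integer cents stand in for a decimal representation.

```cpp
#include <cassert>

// Returns true when the binary floating-point sum a + b compares equal to
// the expected value. Assumes double is IEEE-754 binary64.
inline bool binary_sum_is_exact(double a, double b, double expected) {
    return a + b == expected;
}

// The same check carried out on integer cents, a scaled-decimal stand-in
// in which every two-digit decimal fraction is represented exactly.
inline bool cents_sum_is_exact(long long a, long long b, long long expected) {
    return a + b == expected;
}
```

With these helpers, binary_sum_is_exact(0.1, 0.2, 0.3) is false because none of the three literals has an exact binary representation, whereas binary_sum_is_exact(0.5, 0.25, 0.75) and cents_sum_is_exact(10, 20, 30) are both true.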
These problems are minor when the application domain accepts, or requires results to have, associated error estimates (as is the case with scientific applications). However, in business and financial applications, computations are either required to be exact (with no rounding errors) unless explicitly rounded, or be supported by detailed analyses that are auditable to be correct. Such applications therefore have to take special care in handling any rounding errors introduced by the computations.
The most efficient way to avoid conversion error is to use decimal arithmetic. Recognizing this, the IEEE-754R Standard for Floating-Point Arithmetic specifies decimal floating-point encodings and arithmetic. This technical report specifies extensions to the International Standard for the C++ programming language to permit the use of decimal arithmetic in a manner consistent with the IEEE-754R standard.
This Technical Report is based on a model of decimal arithmetic which is a formalization of the decimal system of numeration (Algorism) as further defined and constrained by the relevant standards, IEEE-854, ANSI X3-274, and IEEE-754R.
There are three components to the model: numbers, operations, and context.
The model defines these components in the abstract. It defines neither the way in which operations are expressed (which might vary depending on the computer language or other interface being used), nor the concrete representation (specific layout in storage, or in a processor's register, for example) of numbers or context.
From the perspective of the C++ language, numbers are represented by data types, operations are defined within expressions, and context is the floating environment specified in <cfenv> and <fenv.h>. This Technical Report specifies how the C++ language implements these components.
Note: A description of the arithmetic model can be found in http://www2.hursley.ibm.com/decimal/decarith.html.
In the C++ International Standard, the representation of a floating-point number is specified in an abstract form where the constituent components of the representation are defined (sign, exponent, significand) but the internals of these components are not. In particular, the exponent range, significand size and the base (or radix), are implementation defined. This allows flexibility for an implementation to take advantage of its underlying hardware architecture. Furthermore, certain behaviors of operations are also implementation defined, for example in the area of handling of special numbers and in exceptions.
This approach was inherited from the C programming language. At the time that C was first standardized, there were already various hardware implementations of floating-point arithmetic in common use. Specifying the exact details of a representation would have made most of the C implementations existing at the time non-conforming.
The C99 standard specifies a binding to IEEE-754 (annex F). Still, conformance to IEEE-754 is not a mandatory requirement. A C99 implementation that conforms to IEEE-754 defines the macro __STDC_IEC_559__.
This Technical Report specifies decimal floating-point arithmetic according to the IEEE-754R standard, with the constituent components of the representation defined. This is more stringent than the approach taken for the floating-point types in the C++ standard. Since it is expected that all decimal floating-point hardware implementations will conform to the IEEE-754R standard, binding to this standard directly benefits both implementers and programmers.
The following standards contain provisions which, through reference in this text, constitute provisions of this Technical Report. For dated references, subsequent amendment to, or revisions of, any of these publications do not apply. However, parties to agreements based on this Technical Report are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document referred applies. Members of the IEC and ISO maintain registers of current valid International Standards.
1.3.1 ISO/IEC 14882:2003, Information technology -- Programming languages, their environments and system software interfaces -- Programming Language C++.
1.3.2 ISO/IEC TR 19768:2005, Information technology -- Programming languages, their environments and system software interfaces -- Technical Report on C++ Library Extensions.
1.3.3 ANSI/IEEE 754R - IEEE Standard for Floating-Point Arithmetic. The Institute of Electrical and Electronic Engineers, Inc.
1.3.4 ANSI/IEEE 854-1987 - IEEE Standard for Radix-Independent Floating-Point Arithmetic. The Institute of Electrical and Electronic Engineers, Inc., New York, 1987.
This technical report is non-normative; the extensions that it describes may be considered for inclusion in a future revision of the International Standard for C++, but they are not currently required for conformance to that standard. Furthermore, it is conceivable that a future revision of the International Standard will include facilities that are similar but not identical to the extensions described in this report.
Although this report describes extensions to the C++ standard library, vendors may choose to implement these extensions in the C++ language translator itself. This practice is permitted so long as all well-formed programs are accepted by the implementation, and the semantics of those programs are the same as they would be had the extensions taken the form of a library. [Note: The same practice is permitted with respect to the implementation of the C++ standard library. --end note]
The result of deriving a user-defined type from std::dfp::decimal32, std::dfp::decimal64, or std::dfp::decimal128 is undefined.
Unless otherwise specified, the whole of the ISO C++ Standard Library introduction [lib.library] is included into this Technical Report by reference.
This Technical Report introduces the following elements to supplement those described in [lib.structure.specifications]:
Unless otherwise specified, the following sections of ISO/IEC Technical Report 19768: "Technical Report on C++ Library Extensions" are included into this Technical Report by reference:
<functional> synopsis [tr.unord.fun.syn]
hash [tr.unord.hash]

This technical report describes 4 categories of library extensions:

dfp::decimal32 in the <dec32> header.
<cmath> and <math.h> in clause 3.7.
is_decimal_floating_point added to the header <type_traits> in clause 3.12.
std::numeric_limits in clauses 3.2.16, 3.3.16, and 3.4.16.
New headers are distinguished from extensions to existing headers by the title of the synopsis clause. In the first case the title is of the form "Header <foo> synopsis," and the synopsis includes all namespace scope declarations contained in the header. In the second case the title is of the form "Additions to header <foo> synopsis" and the synopsis includes only the extensions, i.e. those namespace scope declarations that are not present in the C++ standard or TR1.
The extensions described in this technical report are declared within the namespace dfp, which is nested inside the namespace std.
Unless otherwise specified, references to other entities described in this technical report are assumed to be qualified with std::dfp::, references to entities described in the C++ standard library are assumed to be qualified with std::, and references to entities described in TR1 are assumed to be qualified with std::tr1::.
Even when an extension is specified as additions to standard headers (the second and third categories in section 2.3), vendors should not simply add declarations to standard headers in a way that would be visible to users by default. [Note: That would fail to be standard conforming, because the new names, even within a namespace, could conflict with user macros. --end note] Users should be required to take explicit action to have access to library extensions. It is recommended either that additional declarations in standard headers be protected with a macro that is not defined by default, or else that all extended headers, including both new headers and parallel versions of standard headers with nonstandard declarations, be placed in a separate directory that is not part of the default search path.
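The first recommendation, guarding additional declarations with a macro that is not defined by default, might look like the sketch below. The macro name EXAMPLE_ENABLE_DFP_EXTENSIONS, the namespace example, and every declaration in it are hypothetical and are not specified by this report.

```cpp
#include <cassert>

// Hypothetical opt-in macro; the name is illustrative and does not come from
// this report. A user would define it before including the header:
// #define EXAMPLE_ENABLE_DFP_EXTENSIONS

namespace example {

// A declaration that is always present in the header.
inline int always_available() { return 1; }

#ifdef EXAMPLE_ENABLE_DFP_EXTENSIONS
// Extension declarations: seen only by translation units that opted in, so
// the new names cannot collide with user macros by default.
struct decimal32_like {};
inline int always_available(decimal32_like) { return 2; }
#endif

// Reports whether the guarded declarations are visible in this translation unit.
inline bool extensions_visible() {
#ifdef EXAMPLE_ENABLE_DFP_EXTENSIONS
    return true;
#else
    return false;
#endif
}

}  // namespace example
```

A translation unit that does not define the macro sees only the unguarded declarations, which is the "explicit action" property that the paragraph above asks for.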
This Technical Report introduces three decimal floating-point types, named decimal32, decimal64, and decimal128. The set of values of type decimal32 is a subset of the set of values of type decimal64; the set of values of the type decimal64 is a subset of the set of values of the type decimal128. Support for decimal128 is optional.
The three decimal encoding formats defined in IEEE-754R correspond to the three decimal floating types as follows:
[Note: this implies that sizeof(std::dfp::decimal32) == 4, sizeof(std::dfp::decimal64) == 8, and sizeof(std::dfp::decimal128) == 16. --end note]
The finite numbers are defined by a sign, an exponent (which is a power of ten), and a decimal integer coefficient. The value of a finite number is given by $(-1)^{sign} \times coefficient \times 10^{exponent}$. Refer to IEEE-754R for details of the format.
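As an illustration of the value formula, the sketch below renders a (sign, coefficient, exponent) triple as an exact decimal string. The struct FiniteNumber and the function to_decimal_string are hypothetical names introduced for this example; they say nothing about the storage layout of the decimal types, and the code assumes a C++11 library for std::to_string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical triple mirroring the abstract model; not an actual encoding.
struct FiniteNumber {
    int sign;                        // 0 for positive, 1 for negative
    unsigned long long coefficient;  // decimal integer coefficient
    int exponent;                    // power of ten
};

// Renders (-1)^sign * coefficient * 10^exponent as an exact decimal string.
inline std::string to_decimal_string(const FiniteNumber& n) {
    std::string digits = std::to_string(n.coefficient);
    if (n.exponent >= 0) {
        // Positive exponent: append trailing zeros (skipped for a zero coefficient).
        if (n.coefficient != 0) {
            digits.append(static_cast<std::size_t>(n.exponent), '0');
        }
    } else {
        std::size_t frac = static_cast<std::size_t>(-n.exponent);
        if (digits.size() <= frac) {
            // Pad on the left so that one digit precedes the decimal point.
            digits.insert(0, frac - digits.size() + 1, '0');
        }
        digits.insert(digits.size() - frac, ".");
    }
    return (n.sign ? "-" : "") + digits;
}
```

Under this sketch, the triples (0, 150, -2) and (0, 15, -1) render as "1.50" and "1.5": two distinct representations of the same numerical value.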
These formats are characterized by the length of the coefficient, and the maximum and minimum exponent. Table 1 shows these characteristics by format.
Format | decimal32 | decimal64 | decimal128 |
Coefficient length in digits | 7 | 16 | 34 |
Maximum Exponent (Emax) | 96 | 384 | 6144 |
Minimum Exponent (Emin) | -95 | -383 | -6143 |
Header <dec32> synopsis

    #include <iosfwd>

    namespace std {
    namespace dfp {

      class decimal32;

      // 3.2.9 initialization from coefficient and exponent:
      decimal32 make_decimal32(long long coeff, int exponent);
      decimal32 make_decimal32(unsigned long long coeff, int exponent);

      // 3.2.10 conversion to generic floating-point type:
      long double decimal32_to_long_double(decimal32 d);
      long double decimal_to_long_double(decimal32 d);

      // 3.2.11 unary arithmetic operators:
      decimal32 operator+(const decimal32 & lhs);
      decimal32 operator-(const decimal32 & lhs);

      // 3.2.12 binary arithmetic operators:
      template <class LHS> implementation-defined operator+(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator+(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator-(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator-(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator*(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator*(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator/(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator/(const decimal32 & lhs, const RHS & rhs);

      // 3.2.13 comparison operators:
      template <class LHS> implementation-defined operator==(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator==(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator!=(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator!=(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator<(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator<(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator<=(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator<=(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator>(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator>(const decimal32 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator>=(const LHS & lhs, const decimal32 & rhs);
      template <class RHS> implementation-defined operator>=(const decimal32 & lhs, const RHS & rhs);

      // 3.2.14 Formatted input:
      template <class charT, class traits>
        std::basic_istream<charT, traits> &
          operator>>(std::basic_istream<charT, traits> & is, decimal32 & d);

      // 3.2.15 Formatted output:
      template <class charT, class traits>
        std::basic_ostream<charT, traits> &
          operator<<(std::basic_ostream<charT, traits> & os, const decimal32 & d);

    }
    }
decimal32
    namespace std {
    namespace dfp {

      class decimal32 {
      public:
        // 3.2.3 construct/copy/destroy:
        decimal32();
        decimal32(const decimal32 & d32);
        decimal32 & operator=(const decimal32 & d32);
        ~decimal32();

        // 3.2.4 conversion from floating-point type:
        explicit decimal32(const decimal64 & d64);
        explicit decimal32(const decimal128 & d128);
        explicit decimal32(float r);
        explicit decimal32(double r);
        explicit decimal32(long double r);

        // 3.2.5 conversion from integral type:
        decimal32(int z);
        decimal32(unsigned int z);
        decimal32(long z);
        decimal32(unsigned long z);
        decimal32(long long z);
        decimal32(unsigned long long z);

        // 3.2.6 conversion to integral type:
        operator long long() const;

        // 3.2.7 increment and decrement operators:
        decimal32 & operator++();
        decimal32 operator++(int);
        decimal32 & operator--();
        decimal32 operator--(int);

        // 3.2.8 compound assignment:
        template <class T> implementation-defined operator+=(const T & rhs);
        template <class T> implementation-defined operator-=(const T & rhs);
        template <class T> implementation-defined operator*=(const T & rhs);
        template <class T> implementation-defined operator/=(const T & rhs);
      };

    }
    }
decimal32();
Effects: Constructs an object of type decimal32 with the value 0.
decimal32(const decimal32 & d32); decimal32 & operator=(const decimal32 & d32);
Effects: Copies an object of type decimal32.
~decimal32();
Effects: Destroys an object of type decimal32.
explicit decimal32(const decimal64 & d64);
Effects: Constructs an object of type decimal32 by converting from type decimal64. Conversion is performed as in IEEE-754R.
explicit decimal32(const decimal128 & d128);
Effects: Constructs an object of type decimal32 by converting from type decimal128. Conversion is performed as in IEEE-754R.
explicit decimal32(float r);
Effects: Constructs an object of type decimal32 by converting from type float. If std::numeric_limits<float>::is_iec559 == true then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
explicit decimal32(double r);
Effects: Constructs an object of type decimal32 by converting from type double. If std::numeric_limits<double>::is_iec559 == true then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
explicit decimal32(long double r);
Effects: Constructs an object of type decimal32 by converting from type long double. If std::numeric_limits<long double>::is_iec559 == true then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
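A program that depends on the IEEE-754R conversion behaviour can test the same condition that the wording above uses. The helper below is a hypothetical convenience, not part of this report; it only wraps the standard std::numeric_limits<T>::is_iec559 query and assumes a C++11 compiler for constexpr.

```cpp
#include <cassert>
#include <limits>

// True when a conversion from T to a decimal type would be performed as in
// IEEE-754R according to the wording above; false when the result would be
// implementation-defined.
template <class T>
constexpr bool conversion_follows_ieee() {
    return std::numeric_limits<T>::is_iec559;
}
```

A caller could use the result in a static_assert to reject platforms on which the converting constructors have implementation-defined results.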
decimal32(int z); decimal32(unsigned int z); decimal32(long z); decimal32(unsigned long z); decimal32(long long z); decimal32(unsigned long long z);
Effects: Constructs an object of type decimal32 by converting from the type of z. Conversion is performed as in IEEE-754R.
operator long long() const;
Returns: The result of the conversion of *this to the type long long, as if performed by the expression llroundd32(*this).
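The sketch below shows the rounding rule that this conversion would follow if llroundd32 behaves like the C99 function llround, which rounds to the nearest integer with halfway cases away from zero. That correspondence is an assumption of this example, as are the name round_half_away and the representation of the operand as a coefficient and a power-of-ten exponent.

```cpp
#include <cassert>

// Rounds coefficient * 10^exponent to the nearest integer, with halfway
// cases rounded away from zero. No overflow checking; assumes the exponent
// lies between -18 and a value small enough for the result to fit.
inline long long round_half_away(long long coefficient, int exponent) {
    if (exponent >= 0) {
        long long value = coefficient;
        for (int i = 0; i < exponent; ++i) value *= 10;
        return value;
    }
    long long divisor = 1;
    for (int i = 0; i < -exponent; ++i) divisor *= 10;
    long long quotient = coefficient / divisor;   // truncates toward zero
    long long remainder = coefficient % divisor;  // carries the sign of coefficient
    long long magnitude = remainder < 0 ? -remainder : remainder;
    if (2 * magnitude >= divisor) {
        quotient += (coefficient < 0 ? -1 : 1);   // step away from zero
    }
    return quotient;
}
```

For instance, 25 x 10^-1 rounds to 3 and -25 x 10^-1 rounds to -3, while 149 x 10^-2 rounds to 1.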
decimal32 & operator++();
Effects: Adds 1 to *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal32 operator++(int);
Effects:
decimal32 tmp = *this; *this += 1; return tmp;
decimal32 & operator--();
Effects: Subtracts 1 from *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal32 operator--(int);
Effects:
decimal32 tmp = *this; *this -= 1; return tmp;
template <class T> implementation-defined operator+=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Adds rhs to *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator-=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Subtracts rhs from *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator*=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Multiplies *this by rhs, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator/=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Divides *this by rhs, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal32 make_decimal32(long long coeff, int exponent); decimal32 make_decimal32(unsigned long long coeff, int exponent);
Returns: An object of type decimal32 with the value $coeff \times 10^{exponent}$.
long double decimal32_to_long_double(decimal32 d); long double decimal_to_long_double(decimal32 d);
Returns: If std::numeric_limits<long double>::is_iec559 == true, returns the result of the conversion of d to long double, performed as in IEEE-754R. Otherwise, the returned value is implementation-defined.
[Editor's note: this notation is ugly. A user-defined conversion operator would be vastly preferable to these functions but, alas, user-defined conversion operators cannot be explicit. A previous draft of this document specified a decimal32 member function named to_long_double() that had the same result as these functions. The current "free function" approach is better because it works regardless of whether the implementation of these types uses library classes or compiler builtins. The decimal32_to_long_double form is provided for C programmers who want to write code that works equally well in C++.]
decimal32 operator+(const decimal32 & lhs);
Returns: Adds lhs to 0, as in IEEE-754R, and returns the result.
decimal32 operator-(const decimal32 & lhs);
Returns: Subtracts lhs from 0, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator+(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Adds rhs to lhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator+(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Adds rhs to lhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator-(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Subtracts rhs from lhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator-(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Subtracts rhs from lhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator*(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Multiplies lhs by rhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator*(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Multiplies lhs by rhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator/(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Divides lhs by rhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator/(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Divides lhs by rhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator==(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is exactly equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator==(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is exactly equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator!=(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is not exactly equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator!=(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is not exactly equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator<(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is less than rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator<(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is less than rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator<=(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is less than or equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator<=(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is less than or equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator>(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is greater than rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator>(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is greater than rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator>=(const LHS & lhs, const decimal32 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is greater than or equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator>=(const decimal32 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is greater than or equal to rhs according to IEEE-754R, false otherwise.
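IEEE-754R comparisons operate on numerical values, so two operands with different coefficient and exponent pairs compare equal whenever they denote the same number. The sketch below illustrates that property for finite values; the struct Dec and the function compare are hypothetical, and the code assumes that aligning the exponents does not overflow long long.

```cpp
#include <cassert>

// Hypothetical finite decimal value: coefficient * 10^exponent.
struct Dec {
    long long coefficient;
    int exponent;
};

// Returns -1, 0, or +1 as a is less than, equal to, or greater than b.
// The operands are brought to a common exponent before their coefficients
// are compared. Assumes the scaling does not overflow.
inline int compare(Dec a, Dec b) {
    while (a.exponent > b.exponent) { a.coefficient *= 10; --a.exponent; }
    while (b.exponent > a.exponent) { b.coefficient *= 10; --b.exponent; }
    return (a.coefficient > b.coefficient) - (a.coefficient < b.coefficient);
}
```

Here compare(Dec{150, -2}, Dec{15, -1}) yields 0: the values 1.50 and 1.5 are equal even though their representations differ.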
template <class charT, class traits> std::basic_istream<charT, traits> & operator>>(std::basic_istream<charT, traits> & is, decimal32 & d);
Effects: This function constructs an object of class std::basic_istream<charT, traits>::sentry. If the sentry object returns true when converted to a value of type bool, input is extracted as if by the following code fragment:

    typedef extended_num_get<charT, std::istreambuf_iterator<charT, traits> > extnumget;
    std::ios_base::iostate err = std::ios_base::goodbit;
    std::use_facet<extnumget>(is.getloc()).get(is, 0, is, err, d);
    is.setstate(err);

If an exception is thrown during input then std::ios::badbit is set in the error state of the input stream is. If (is.exceptions() & std::ios_base::badbit) != 0 then the exception is rethrown. In any case, the formatted input function destroys the sentry object.
Returns: is.
template <class charT, class traits> std::basic_ostream<charT, traits> & operator<<(std::basic_ostream<charT, traits> & os, const decimal32 & d);
Effects: This function constructs an object of class std::basic_ostream<charT, traits>::sentry. If the sentry object returns true when converted to a value of type bool, output is generated as if by the following code fragment:

    typedef extended_num_put<charT, std::ostreambuf_iterator<charT, traits> > extnumput;
    bool failed = std::use_facet<extnumput>(os.getloc()).put(os, os, os.fill(), d).failed();
    if (failed) { os.setstate(std::ios_base::failbit); }

If an exception is thrown during output then std::ios::badbit is set in the error state of the output stream os. If (os.exceptions() & std::ios_base::badbit) != 0 then the exception is rethrown. In any case, the formatted output function destroys the sentry object.
Returns: os.
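The sentry protocol described for the formatted output function can be tried with the standard std::num_put facet standing in for extended_num_put. This is a sketch under stated assumptions: the type Cents and the helper format_cents are hypothetical, the value is an integer count rather than a decimal floating-point number, and the exception handling (setting badbit and rethrowing) is omitted for brevity.

```cpp
#include <cassert>
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a decimal value: an integer count of cents.
struct Cents {
    long long value;
};

// Formatted output that follows the sentry-then-facet pattern, with the
// standard num_put facet used in place of extended_num_put.
template <class charT, class traits>
std::basic_ostream<charT, traits>&
operator<<(std::basic_ostream<charT, traits>& os, const Cents& c) {
    typename std::basic_ostream<charT, traits>::sentry guard(os);
    if (guard) {
        typedef std::num_put<charT, std::ostreambuf_iterator<charT, traits> > numput;
        bool failed =
            std::use_facet<numput>(os.getloc()).put(os, os, os.fill(), c.value).failed();
        if (failed) {
            os.setstate(std::ios_base::failbit);
        }
    }
    return os;  // the sentry is destroyed on return
}

// Convenience wrapper used to exercise the operator.
inline std::string format_cents(Cents c) {
    std::ostringstream out;
    out << c;
    return out.str();
}
```

When the stream is already in a failed state, the sentry converts to false and nothing is written, which matches the conditional wording of the Effects paragraph.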
<limits>
The standard template std::numeric_limits shall be specialized for the decimal32 type.
[Example:
    namespace std {

      template<> class numeric_limits<dfp::decimal32> {
      public:
        static const bool is_specialized = true;

        static dfp::decimal32 min() throw() { return DEC32_MIN; }
        static dfp::decimal32 max() throw() { return DEC32_MAX; }

        static const int digits = 7;
        static const int digits10 = digits;
        static const int max_digits10 = digits;

        static const bool is_signed = true;
        static const bool is_integer = false;
        static const bool is_exact = false;

        static const int radix = 10;
        static dfp::decimal32 epsilon() throw() { return DEC32_EPSILON; }
        static dfp::decimal32 round_error() throw() { return ...; }

        static const int min_exponent = -95;
        static const int min_exponent10 = min_exponent;
        static const int max_exponent = 96;
        static const int max_exponent10 = max_exponent;

        static const bool has_infinity = true;
        static const bool has_quiet_NaN = true;
        static const bool has_signaling_NaN = true;
        static const float_denorm_style has_denorm = denorm_present;
        static const bool has_denorm_loss = true;

        static dfp::decimal32 infinity() throw() { return ...; }
        static dfp::decimal32 quiet_NaN() throw() { return ...; }
        static dfp::decimal32 signaling_NaN() throw() { return ...; }
        static dfp::decimal32 denorm_min() throw() { return DEC32_DEN; }

        static const bool is_iec559 = false;
        static const bool is_bounded = true;
        static const bool is_modulo = false;

        static const bool traps = true;
        static const bool tinyness_before = true;

        static const float_round_style round_style = round_indeterminate;
      };

    }
--end example]
Header <dec64> synopsis

    #include <iosfwd>

    namespace std {
    namespace dfp {

      class decimal64;

      // 3.3.9 initialization from coefficient and exponent:
      decimal64 make_decimal64(long long coeff, int exponent);
      decimal64 make_decimal64(unsigned long long coeff, int exponent);

      // 3.3.10 conversion to generic floating-point type:
      long double decimal64_to_long_double(decimal64 d);
      long double decimal_to_long_double(decimal64 d);

      // 3.3.11 unary arithmetic operators:
      decimal64 operator+(const decimal64 & lhs);
      decimal64 operator-(const decimal64 & lhs);

      // 3.3.12 binary arithmetic operators:
      template <class LHS> implementation-defined operator+(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator+(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator-(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator-(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator*(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator*(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator/(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator/(const decimal64 & lhs, const RHS & rhs);

      // 3.3.13 comparison operators:
      template <class LHS> implementation-defined operator==(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator==(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator!=(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator!=(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator<(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator<(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator<=(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator<=(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator>(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator>(const decimal64 & lhs, const RHS & rhs);
      template <class LHS> implementation-defined operator>=(const LHS & lhs, const decimal64 & rhs);
      template <class RHS> implementation-defined operator>=(const decimal64 & lhs, const RHS & rhs);

      // 3.3.14 Formatted input:
      template <class charT, class traits>
        std::basic_istream<charT, traits> &
          operator>>(std::basic_istream<charT, traits> & is, decimal64 & d);

      // 3.3.15 Formatted output:
      template <class charT, class traits>
        std::basic_ostream<charT, traits> &
          operator<<(std::basic_ostream<charT, traits> & os, const decimal64 & d);

    }
    }
decimal64
    namespace std {
    namespace dfp {

      class decimal64 {
      public:
        // 3.3.3 construct/copy/destroy:
        decimal64();
        decimal64(const decimal64 & d64);
        decimal64 & operator=(const decimal64 & d64);
        ~decimal64();

        // 3.3.4 conversion from floating-point type:
        decimal64(const decimal32 & d32);
        explicit decimal64(const decimal128 & d128);
        explicit decimal64(float r);
        explicit decimal64(double r);
        explicit decimal64(long double r);

        // 3.3.5 conversion from integral type:
        decimal64(int z);
        decimal64(unsigned int z);
        decimal64(long z);
        decimal64(unsigned long z);
        decimal64(long long z);
        decimal64(unsigned long long z);

        // 3.3.6 conversion to integral type:
        operator long long() const;

        // 3.3.7 increment and decrement operators:
        decimal64 & operator++();
        decimal64 operator++(int);
        decimal64 & operator--();
        decimal64 operator--(int);

        // 3.3.8 compound assignment:
        template <class T> implementation-defined operator+=(T rhs);
        template <class T> implementation-defined operator-=(T rhs);
        template <class T> implementation-defined operator*=(T rhs);
        template <class T> implementation-defined operator/=(T rhs);
      };

    }
    }
decimal64();
Effects: Constructs an object of type decimal64 with the value 0.
decimal64(const decimal64 & d64); decimal64 & operator=(const decimal64 & d64);
Effects: Copies an object of type decimal64.
~decimal64();
Effects: Destroys an object of type decimal64.
decimal64(const decimal32 & d32);
Effects: Constructs an object of type decimal64 by converting from type decimal32. Conversion is performed as in IEEE-754R.
explicit decimal64(const decimal128 & d128);
Effects: Constructs an object of type decimal64 by converting from type decimal128. Conversion is performed as in IEEE-754R.
explicit decimal64(float r);
Effects: Constructs an object of type decimal64 by converting from type float. If std::numeric_limits<float>::is_iec559 == true, then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
explicit decimal64(double r);
Effects: Constructs an object of type decimal64 by converting from type double. If std::numeric_limits<double>::is_iec559 == true, then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
explicit decimal64(long double r);
Effects: Constructs an object of type decimal64 by converting from type long double. If std::numeric_limits<long double>::is_iec559 == true, then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
decimal64(int z); decimal64(unsigned int z); decimal64(long z); decimal64(unsigned long z); decimal64(long long z); decimal64(unsigned long long z);
Effects: Constructs an object of type decimal64 by converting from the type of z. Conversion is performed as in IEEE-754R.
operator long long() const;
Returns: The result of the conversion of *this to the type long long, as if performed by the expression llroundd64(*this).
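As a rough illustration of what an llround-style conversion implies, the following sketch rounds a coefficient/exponent pair to the nearest integer, with halfway cases rounded away from zero. It assumes that llroundd64 follows the rounding rule of the C llround family. ToyDecimal and toy_llround are hypothetical names introduced only for this example; they are not part of this Technical Report, and overflow is not handled.

```cpp
#include <cassert>
#include <climits>

// Hypothetical toy model (not the TR's decimal64): value = coeff * 10^exponent.
struct ToyDecimal {
    long long coeff;
    int exponent;
};

// Round to the nearest integer, halfway cases away from zero.
// Assumes the result fits in long long.
long long toy_llround(ToyDecimal d) {
    assert(d.coeff != LLONG_MIN);           // negating LLONG_MIN would overflow
    bool negative = d.coeff < 0;
    long long mag = negative ? -d.coeff : d.coeff;
    int e = d.exponent;
    for (; e > 0; --e) mag *= 10;           // positive exponent: scale up
    int dropped = 0;                        // most significant digit removed so far
    for (; e < 0; ++e) {                    // negative exponent: drop one digit per step
        dropped = static_cast<int>(mag % 10);
        mag /= 10;
    }
    if (dropped >= 5) ++mag;                // one half or more rounds away from zero
    return negative ? -mag : mag;
}
```

Under these assumptions, the pair (25, -1), which represents 2.5, converts to 3, and (-25, -1) converts to -3.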
decimal64 & operator++();
Effects: Adds 1 to *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal64 operator++(int);
Effects:
decimal64 tmp = *this; *this += 1; return tmp;
decimal64 & operator--();
Effects: Subtracts 1 from *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal64 operator--(int);
Effects:
decimal64 tmp = *this; *this -= 1; return tmp;
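The relationship between the prefix and postfix forms can be sketched on a toy type. The example below assumes a simple coefficient/exponent representation; ToyDecimal is a hypothetical name used only for illustration, and neither rounding nor overflow is modelled.

```cpp
#include <cassert>

// Hypothetical toy model (not the TR's decimal64): value = coeff * 10^exponent.
struct ToyDecimal {
    long long coeff;
    int exponent;

    // Add an integer exactly by bringing both operands to a common exponent.
    ToyDecimal & operator+=(long long n) {
        int e = exponent;
        for (; e < 0; ++e) n *= 10;                       // scale n to a negative exponent
        for (; e > 0; --e) { coeff *= 10; --exponent; }   // or lower our exponent to 0
        coeff += n;
        return *this;
    }
    ToyDecimal & operator-=(long long n) { return *this += -n; }

    ToyDecimal & operator++() { *this += 1; return *this; }
    ToyDecimal operator++(int) { ToyDecimal tmp = *this; *this += 1; return tmp; }
    ToyDecimal & operator--() { *this -= 1; return *this; }
    ToyDecimal operator--(int) { ToyDecimal tmp = *this; *this -= 1; return tmp; }
};

// Usage: the postfix form returns the old value, the prefix form the new one.
inline void increment_demo() {
    ToyDecimal d = {15, -1};                  // 1.5
    ToyDecimal old = d++;                     // old is 1.5, d is 2.5
    assert(old.coeff == 15 && d.coeff == 25);
    ToyDecimal & same = ++d;                  // d is 3.5
    assert(&same == &d && d.coeff == 35);
}
```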
template <class T> implementation-defined operator+=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Adds rhs to *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator-=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Subtracts rhs from *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator*=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Multiplies *this by rhs, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator/=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Divides *this by rhs, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal64 make_decimal64(long long coeff, int exponent); decimal64 make_decimal64(unsigned long long coeff, int exponent);
Returns: A value of type decimal64 equal to coeff × 10^exponent; conversion is performed as in IEEE-754R.
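To illustrate construction from a coefficient and an exponent, the following sketch assumes the intended value is coeff × 10^exponent and models it with a toy type. All names here (ToyDecimal, make_toy_decimal, align_exponents, add, same_value) are hypothetical; the sketch ignores rounding and overflow, which a conforming decimal64 would handle as in IEEE-754R.

```cpp
#include <cassert>

// Hypothetical toy model (not the TR's decimal64): value = coeff * 10^exponent.
struct ToyDecimal {
    long long coeff;
    int exponent;
};

ToyDecimal make_toy_decimal(long long coeff, int exponent) {
    ToyDecimal d = {coeff, exponent};
    return d;
}

// Rewrite both operands so that they share the smaller of the two exponents.
void align_exponents(ToyDecimal & a, ToyDecimal & b) {
    while (a.exponent > b.exponent) { a.coeff *= 10; --a.exponent; }
    while (b.exponent > a.exponent) { b.coeff *= 10; --b.exponent; }
}

// Exact addition of two toy values.
ToyDecimal add(ToyDecimal a, ToyDecimal b) {
    align_exponents(a, b);
    return make_toy_decimal(a.coeff + b.coeff, a.exponent);
}

// Numeric equality ignores the representation: 1 x 10^0 equals 10 x 10^-1.
bool same_value(ToyDecimal a, ToyDecimal b) {
    align_exponents(a, b);
    return a.coeff == b.coeff;
}

inline void make_demo() {
    ToyDecimal sum = add(make_toy_decimal(1, -1), make_toy_decimal(2, -1));
    assert(same_value(sum, make_toy_decimal(3, -1)));   // 0.1 + 0.2 is exactly 0.3
}
```

Because the decimal fractions 0.1, 0.2 and 0.3 are stored exactly in this representation, the sum compares equal to 0.3 without any conversion error.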
long double decimal64_to_long_double(decimal64 d); long double decimal_to_long_double(decimal64 d);
Returns: If std::numeric_limits<long double>::is_iec559 == true, returns the result of the conversion of d to long double, performed as in IEEE-754R. Otherwise, the returned value is implementation-defined.
[Editor's note: this notation is ugly. A user-defined conversion operator would be vastly preferable to these functions but, alas, user-defined conversion operators cannot be explicit. A previous draft of this document specified a decimal64 member function named to_long_double() that had the same result as these functions. The current "free function" approach is better because it works regardless of whether the implementation of these types uses library classes or compiler builtins. The decimal64_to_long_double form is provided for C programmers who want to write code that works equally well in C++.]
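The free-function approach can be sketched as follows. The example assumes two toy types and an overloaded conversion function; ToyDec32, ToyDec64, toy_decimal_to_long_double and half_of are hypothetical names, and the scaling loop is only an approximation of a correctly rounded conversion.

```cpp
#include <cassert>

// Hypothetical toy models: value = coeff * 10^exponent.
struct ToyDec32 { int coeff; int exponent; };
struct ToyDec64 { long long coeff; int exponent; };

// Scale c by 10^exponent. Repeated multiplication or division may accumulate
// rounding error; a real conversion would round only once.
inline long double scale_by_power_of_ten(long double c, int exponent) {
    for (; exponent > 0; --exponent) c *= 10.0L;
    for (; exponent < 0; ++exponent) c /= 10.0L;
    return c;
}

// One overloaded name serves every toy type, so generic code needs no
// type-specific spelling and nothing converts implicitly.
inline long double toy_decimal_to_long_double(ToyDec32 d) {
    return scale_by_power_of_ten(static_cast<long double>(d.coeff), d.exponent);
}
inline long double toy_decimal_to_long_double(ToyDec64 d) {
    return scale_by_power_of_ten(static_cast<long double>(d.coeff), d.exponent);
}

// Generic code relies on the overloaded name.
template <class Decimal>
long double half_of(Decimal d) {
    return toy_decimal_to_long_double(d) / 2.0L;
}

inline void conversion_demo() {
    ToyDec64 d = {15, -1};                                // 1.5
    assert(toy_decimal_to_long_double(d) == 1.5L);
    assert(half_of(d) == 0.75L);
}
```

The conversion is always spelled out at the call site, which is the property an explicit conversion operator would have provided.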
decimal64 operator+(const decimal64 & lhs);
Returns: Adds lhs to 0, as in IEEE-754R, and returns the result.
decimal64 operator-(const decimal64 & lhs);
Returns: Subtracts lhs from 0, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator+(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Adds rhs to lhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator+(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Adds rhs to lhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator-(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Subtracts rhs from lhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator-(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Subtracts rhs from lhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator*(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Multiplies lhs by rhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator*(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Multiplies lhs by rhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator/(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Divides lhs by rhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator/(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Divides lhs by rhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator==(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is exactly equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator==(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is exactly equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator!=(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is not exactly equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator!=(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is not exactly equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator<(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is less than rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator<(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is less than rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator<=(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is less than or equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator<=(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is less than or equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator>(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is greater than rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator>(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is greater than rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator>=(const LHS & lhs, const decimal64 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is greater than or equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator>=(const decimal64 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is greater than or equal to rhs according to IEEE-754R, false otherwise.
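One possible way for a library implementation to honour a "Constraints" clause is to remove the operator template from overload resolution when the other operand is not of a permitted type. The sketch below does this with std::enable_if (a facility standardized after this draft) for the integral-type constraint only. ToyDecimal is a hypothetical stand-in that holds whole units; the sketch does not claim to be how any real implementation is written.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a decimal type; it holds whole units only.
struct ToyDecimal { long long units; };

// Decimal-to-decimal comparison.
inline bool operator<(const ToyDecimal & lhs, const ToyDecimal & rhs) {
    return lhs.units < rhs.units;
}

// Considered only when LHS is an integral type; for other types the
// declaration silently drops out of overload resolution.
// The cast is a simplification: very large unsigned values would wrap.
template <class LHS>
typename std::enable_if<std::is_integral<LHS>::value, bool>::type
operator<(const LHS & lhs, const ToyDecimal & rhs) {
    return static_cast<long long>(lhs) < rhs.units;
}

// Mirror image for an integral right-hand operand.
template <class RHS>
typename std::enable_if<std::is_integral<RHS>::value, bool>::type
operator<(const ToyDecimal & lhs, const RHS & rhs) {
    return lhs.units < static_cast<long long>(rhs);
}

inline void comparison_demo() {
    ToyDecimal three = {3};
    ToyDecimal four = {4};
    assert(2 < three);        // integral left operand
    assert(!(three < 3L));    // integral right operand
    assert(three < four);     // both operands are toy decimals
}
```

With this arrangement an expression such as 2.5 < three does not compile, because double satisfies neither constraint and no other conversion exists.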
template <class charT, class traits> std::basic_istream<charT, traits> & operator>>(std::basic_istream<charT, traits> & is, decimal64 & d);
Effects: This function constructs an object of class std::basic_istream<charT, traits>::sentry. If the sentry object returns true when converted to a value of type bool, input is extracted as if by the following code fragment:
  typedef extended_num_get<charT, std::istreambuf_iterator<charT, traits> > extnumget;
  std::ios_base::iostate err = std::ios_base::goodbit;
  std::use_facet<extnumget>(is.getloc()).get(is, 0, is, err, d);
  is.setstate(err);
If an exception is thrown during input then std::ios::badbit is set in the error state of the input stream is. If (is.exceptions() & std::ios_base::badbit) != 0 then the exception is rethrown. In any case, the formatted input function destroys the sentry object.
Returns: is.
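The sentry-and-facet pattern described above can be tried out with facilities that already exist in the standard library. The sketch below substitutes the standard std::num_get facet for extended_num_get and reads a plain long into a hypothetical ToyDecimal type; it also omits the exception handling that the specification requires, so it illustrates only the control flow.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <locale>
#include <sstream>

// Hypothetical stand-in for a decimal type; it holds whole units only.
struct ToyDecimal { long units; };

template <class charT, class traits>
std::basic_istream<charT, traits> &
operator>>(std::basic_istream<charT, traits> & is, ToyDecimal & d)
{
    // The sentry skips leading whitespace and checks the stream state.
    typename std::basic_istream<charT, traits>::sentry ok(is);
    if (ok) {
        typedef std::istreambuf_iterator<charT, traits> iter;
        typedef std::num_get<charT, iter> numget;    // standard facet, not extended_num_get
        std::ios_base::iostate err = std::ios_base::goodbit;
        long value = 0;
        std::use_facet<numget>(is.getloc()).get(is, iter(), is, err, value);
        if (!(err & std::ios_base::failbit))
            d.units = value;                         // store only after a successful parse
        is.setstate(err);
    }
    return is;
}

inline void input_demo() {
    std::istringstream in("  42");
    ToyDecimal d = {0};
    in >> d;
    assert(!in.fail() && d.units == 42);
}
```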
template <class charT, class traits> std::basic_ostream<charT, traits> & operator<<(std::basic_ostream<charT, traits> & os, const decimal64 & d);
Effects: This function constructs an object of class std::basic_ostream<charT, traits>::sentry. If the sentry object returns true when converted to a value of type bool, output is generated as if by the following code fragment:
  typedef extended_num_put<charT, std::ostreambuf_iterator<charT, traits> > extnumput;
  bool failed = std::use_facet<extnumput>(os.getloc()).put(os, os, os.fill(), d).failed();
  if (failed) { os.setstate(std::ios_base::failbit); }
If an exception is thrown during output then std::ios::badbit is set in the error state of the output stream os. If (os.exceptions() & std::ios_base::badbit) != 0 then the exception is rethrown. In any case, the formatted output function destroys the sentry object.
Returns: os.
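The output side can be sketched in the same spirit. The example below substitutes the standard std::num_put facet for extended_num_put and writes a plain long held by a hypothetical ToyDecimal type; exception handling is again omitted, so this shows only the sentry and facet calls.

```cpp
#include <cassert>
#include <iterator>
#include <locale>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for a decimal type; it holds whole units only.
struct ToyDecimal { long units; };

template <class charT, class traits>
std::basic_ostream<charT, traits> &
operator<<(std::basic_ostream<charT, traits> & os, const ToyDecimal & d)
{
    // The sentry checks the stream state and flushes any tied stream.
    typename std::basic_ostream<charT, traits>::sentry ok(os);
    if (ok) {
        typedef std::ostreambuf_iterator<charT, traits> iter;
        typedef std::num_put<charT, iter> numput;    // standard facet, not extended_num_put
        bool failed =
            std::use_facet<numput>(os.getloc()).put(os, os, os.fill(), d.units).failed();
        if (failed) { os.setstate(std::ios_base::failbit); }
    }
    return os;
}

inline void output_demo() {
    std::ostringstream out;
    ToyDecimal d = {42};
    out << d;
    assert(out.str() == "42");
}
```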
<limits>
The standard template std::numeric_limits shall be specialized for the decimal64 type.
[Example:
namespace std {

  template<> class numeric_limits<dfp::decimal64> {
  public:
    static const bool is_specialized = true;

    static dfp::decimal64 min() throw() { return DEC64_MIN; }
    static dfp::decimal64 max() throw() { return DEC64_MAX; }

    static const int digits = 16;
    static const int digits10 = digits;
    static const int max_digits10 = digits;

    static const bool is_signed = true;
    static const bool is_integer = false;
    static const bool is_exact = false;

    static const int radix = 10;
    static dfp::decimal64 epsilon() throw() { return DEC64_EPSILON; }
    static dfp::decimal64 round_error() throw() { return ...; }

    static const int min_exponent = -383;
    static const int min_exponent10 = min_exponent;
    static const int max_exponent = 384;
    static const int max_exponent10 = max_exponent;

    static const bool has_infinity = true;
    static const bool has_quiet_NaN = true;
    static const bool has_signaling_NaN = true;
    static const float_denorm_style has_denorm = denorm_present;
    static const bool has_denorm_loss = true;

    static dfp::decimal64 infinity() throw() { return ...; }
    static dfp::decimal64 quiet_NaN() throw() { return ...; }
    static dfp::decimal64 signaling_NaN() throw() { return ...; }
    static dfp::decimal64 denorm_min() throw() { return DEC64_DEN; }

    static const bool is_iec559 = false;
    static const bool is_bounded = true;
    static const bool is_modulo = false;

    static const bool traps = true;
    static const bool tinyness_before = true;

    static const float_round_style round_style = round_indeterminate;
  };

}
--end example]
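A specialization of this kind lets generic code discover at compile time that a type uses radix 10. The sketch below specializes std::numeric_limits for a hypothetical ToyDecimal64 type with only a few of the members shown above, and queries it from a helper template; has_decimal_radix is a name invented for this example.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for decimal64: value = coeff * 10^exponent.
struct ToyDecimal64 { long long coeff; int exponent; };

namespace std {
    // Deliberately partial: only the members used below are provided.
    template <> class numeric_limits<ToyDecimal64> {
    public:
        static const bool is_specialized = true;
        static const int digits = 16;
        static const int digits10 = digits;
        static const int radix = 10;
    };
}

// True for types whose numeric_limits specialization reports radix 10.
template <class T>
bool has_decimal_radix() {
    return std::numeric_limits<T>::is_specialized
        && std::numeric_limits<T>::radix == 10;
}

inline void limits_demo() {
    assert(has_decimal_radix<ToyDecimal64>());
    assert(std::numeric_limits<ToyDecimal64>::digits10 == 16);
}
```

Because the radix is 10, digits10 equals digits: every coefficient digit is a decimal digit, so no precision is lost when a value is written out in decimal notation.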
<dec128> synopsis
#include <iosfwd> namespace std { namespace dfp { class decimal128; // 3.4.9 initialization from coefficient and exponent: decimal128 make_decimal128(long long coeff, int exponent); decimal128 make_decimal128(unsigned long long coeff, int exponent); // 3.4.10 conversion functions: long double decimal128_to_long_double(decimal128 d); long double decimal_to_long_double(decimal128 d); // 3.4.11 unary arithmetic operators: decimal128 operator+(const decimal128 & lhs); decimal128 operator-(const decimal128 & lhs); // 3.4.12 binary arithmetic operators: template <class LHS> implementation-defined operator+(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator+(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator-(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator-(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator*(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator*(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator/(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator/(const decimal128 & lhs, const RHS & rhs); // 3.4.13 comparison operators: template <class LHS> implementation-defined operator==(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator==(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator!=(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator!=(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator<(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator<(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator<=(const LHS 
& lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator<=(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator>(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator>(const decimal128 & lhs, const RHS & rhs); template <class LHS> implementation-defined operator>=(const LHS & lhs, const decimal128 & rhs); template <class RHS> implementation-defined operator>=(const decimal128 & lhs, const RHS & rhs); // 3.4.14 Formatted input template <class charT, class traits> std::basic_istream<charT, traits> & operator>>(std::basic_istream<charT, traits> & is, decimal128 & d); // 3.4.15 Formatted output template <class charT, class traits> std::basic_ostream<charT, traits> & operator<<(std::basic_ostream<charT, traits> & os, const decimal128 & d); } }
decimal128
namespace std {
namespace dfp {

  class decimal128 {
  public:
    // 3.4.3 construct/copy/destroy:
    decimal128();
    decimal128(const decimal128 & d128);
    decimal128 & operator=(const decimal128 & d128);
    ~decimal128();

    // 3.4.4 conversion from floating-point type:
    decimal128(const decimal32 & d32);
    decimal128(const decimal64 & d64);
    explicit decimal128(float r);
    explicit decimal128(double r);
    explicit decimal128(long double r);

    // 3.4.5 conversion from integral type:
    decimal128(int z);
    decimal128(unsigned int z);
    decimal128(long z);
    decimal128(unsigned long z);
    decimal128(long long z);
    decimal128(unsigned long long z);

    // 3.4.6 conversion to integral type:
    operator long long() const;

    // 3.4.7 increment and decrement operators:
    decimal128 & operator++();
    decimal128 operator++(int);
    decimal128 & operator--();
    decimal128 operator--(int);

    // 3.4.8 compound assignment:
    template <class T> implementation-defined operator+=(const T & rhs);
    template <class T> implementation-defined operator-=(const T & rhs);
    template <class T> implementation-defined operator*=(const T & rhs);
    template <class T> implementation-defined operator/=(const T & rhs);
  };

}
}
decimal128();
Effects: Constructs an object of type decimal128 with the value 0.
decimal128(const decimal128 & d128); decimal128 & operator=(const decimal128 & d128);
Effects: Copies an object of type decimal128.
~decimal128();
Effects: Destroys an object of type decimal128.
decimal128(const decimal32 & d32);
Effects: Constructs an object of type decimal128 by converting from type decimal32. Conversion is performed as in IEEE-754R.
decimal128(const decimal64 & d64);
Effects: Constructs an object of type decimal128 by converting from type decimal64. Conversion is performed as in IEEE-754R.
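The difference between the implicit widening constructors of decimal128 and the explicit narrowing constructor of decimal64 can be checked at compile time. The sketch below uses two hypothetical types, Toy64 and Toy128, that copy only the constructor pattern; their storage is not a faithful model of either format.

```cpp
#include <cassert>
#include <type_traits>

struct Toy128;

// Hypothetical stand-in for decimal64.
struct Toy64 {
    long long coeff;
    int exponent;
    Toy64(long long c, int e) : coeff(c), exponent(e) {}
    explicit Toy64(const Toy128 & wide);          // narrowing: must be requested
};

// Hypothetical stand-in for decimal128.
struct Toy128 {
    long long coeff;                              // a real type would hold 34 digits
    int exponent;
    Toy128(long long c, int e) : coeff(c), exponent(e) {}
    Toy128(const Toy64 & narrow)                  // widening: implicit
        : coeff(narrow.coeff), exponent(narrow.exponent) {}
};

inline Toy64::Toy64(const Toy128 & wide)
    : coeff(wide.coeff), exponent(wide.exponent) {}

static_assert(std::is_convertible<Toy64, Toy128>::value,
              "widening happens implicitly");
static_assert(!std::is_convertible<Toy128, Toy64>::value,
              "narrowing never happens implicitly");
static_assert(std::is_constructible<Toy64, Toy128>::value,
              "narrowing is available with explicit syntax");

inline void widening_demo() {
    Toy128 wide = Toy64(15, -1);                  // implicit conversion
    Toy64 narrow(wide);                           // explicit conversion
    assert(narrow.coeff == 15 && narrow.exponent == -1);
}
```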
explicit decimal128(float r);
Effects: Constructs an object of type decimal128 by converting from type float. If std::numeric_limits<float>::is_iec559 == true, then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
explicit decimal128(double r);
Effects: Constructs an object of type decimal128 by converting from type double. If std::numeric_limits<double>::is_iec559 == true, then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
explicit decimal128(long double r);
Effects: Constructs an object of type decimal128 by converting from type long double. If std::numeric_limits<long double>::is_iec559 == true, then the conversion is performed as in IEEE-754R. Otherwise, the result of the conversion is implementation-defined.
decimal128(int z); decimal128(unsigned int z); decimal128(long z); decimal128(unsigned long z); decimal128(long long z); decimal128(unsigned long long z);
Effects: Constructs an object of type decimal128 by converting from the type of z. Conversion is performed as in IEEE-754R.
operator long long() const;
Returns: The result of the conversion of *this to the type long long, as if performed by the expression llroundd128(*this).
decimal128 & operator++();
Effects: Adds 1 to *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal128 operator++(int);
Effects:
decimal128 tmp = *this; *this += 1; return tmp;
decimal128 & operator--();
Effects: Subtracts 1 from *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal128 operator--(int);
Effects:
decimal128 tmp = *this; *this -= 1; return tmp;
template <class T> implementation-defined operator+=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Adds rhs to *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator-=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Subtracts rhs from *this, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator*=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Multiplies *this by rhs, as in IEEE-754R, and assigns the result to *this.
Returns: *this
template <class T> implementation-defined operator/=(const T & rhs);
Constraints: T is one of the integral types, or one of the decimal floating-point types.
Effects: Divides *this by rhs, as in IEEE-754R, and assigns the result to *this.
Returns: *this
decimal128 make_decimal128(long long coeff, int exponent); decimal128 make_decimal128(unsigned long long coeff, int exponent);
Returns: A value of type decimal128 equal to coeff × 10^exponent; conversion is performed as in IEEE-754R.
long double decimal128_to_long_double(decimal128 d); long double decimal_to_long_double(decimal128 d);
Returns: If std::numeric_limits<long double>::is_iec559 == true, returns the result of the conversion of d to long double, performed as in IEEE-754R. Otherwise, the returned value is implementation-defined.
[Editor's note: this notation is ugly. A user-defined conversion operator would be vastly preferable to these functions but, alas, user-defined conversion operators cannot be explicit. A previous draft of this document specified a decimal128 member function named to_long_double() that had the same result as these functions. The current "free function" approach is better because it works regardless of whether the implementation of these types uses library classes or compiler builtins. The decimal128_to_long_double form is provided for C programmers who want to write code that works equally well in C++.]
decimal128 operator+(const decimal128 & lhs);
Returns: Adds lhs to 0, as in IEEE-754R, and returns the result.
decimal128 operator-(const decimal128 & lhs);
Returns: Subtracts lhs from 0, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator+(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Adds rhs to lhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator+(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Adds rhs to lhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator-(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Subtracts rhs from lhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator-(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Subtracts rhs from lhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator*(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Multiplies lhs by rhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator*(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Multiplies lhs by rhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator/(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: Divides lhs by rhs, as in IEEE-754R, and returns the result.
template <class RHS> implementation-defined operator/(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: Divides lhs by rhs, as in IEEE-754R, and returns the result.
template <class LHS> implementation-defined operator==(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is exactly equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator==(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is exactly equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator!=(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is not exactly equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator!=(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is not exactly equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator<(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is less than rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator<(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is less than rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator<=(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is less than or equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator<=(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is less than or equal to rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator>(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is greater than rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator>(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is greater than rhs according to IEEE-754R, false otherwise.
template <class LHS> implementation-defined operator>=(const LHS & lhs, const decimal128 & rhs);
Constraints: LHS is one of the integral types or one of the decimal floating-point types.
Returns: true if lhs is greater than or equal to rhs according to IEEE-754R, false otherwise.
template <class RHS> implementation-defined operator>=(const decimal128 & lhs, const RHS & rhs);
Constraints: RHS is one of the integral types.
Returns: true if lhs is greater than or equal to rhs according to IEEE-754R, false otherwise.
template <class charT, class traits> std::basic_istream<charT, traits> & operator>>(std::basic_istream<charT, traits> & is, decimal128 & d);
Effects: This function constructs an object of class std::basic_istream<charT, traits>::sentry. If the sentry object returns true when converted to a value of type bool, input is extracted as if by the following code fragment:
  typedef extended_num_get<charT, std::istreambuf_iterator<charT, traits> > extnumget;
  std::ios_base::iostate err = std::ios_base::goodbit;
  std::use_facet<extnumget>(is.getloc()).get(is, 0, is, err, d);
  is.setstate(err);
If an exception is thrown during input then std::ios::badbit is set in the error state of the input stream is. If (is.exceptions() & std::ios_base::badbit) != 0 then the exception is rethrown. In any case, the formatted input function destroys the sentry object.
Returns: is.
template <class charT, class traits> std::basic_ostream<charT, traits> & operator<<(std::basic_ostream<charT, traits> & os, const decimal128 & d);
Effects: This function constructs an object of class std::basic_ostream<charT, traits>::sentry. If the sentry object returns true when converted to a value of type bool, output is generated as if by the following code fragment:
  typedef extended_num_put<charT, std::ostreambuf_iterator<charT, traits> > extnumput;
  bool failed = std::use_facet<extnumput>(os.getloc()).put(os, os, os.fill(), d).failed();
  if (failed) { os.setstate(std::ios_base::failbit); }
If an exception is thrown during output then std::ios::badbit is set in the error state of the output stream os. If (os.exceptions() & std::ios_base::badbit) != 0 then the exception is rethrown. In any case, the formatted output function destroys the sentry object.
Returns: os.
<limits>
The standard template std::numeric_limits shall be specialized for the decimal128 type.
[Example:
namespace std {

  template<> class numeric_limits<dfp::decimal128> {
  public:
    static const bool is_specialized = true;

    static dfp::decimal128 min() throw() { return DEC128_MIN; }
    static dfp::decimal128 max() throw() { return DEC128_MAX; }

    static const int digits = 34;
    static const int digits10 = digits;
    static const int max_digits10 = digits;

    static const bool is_signed = true;
    static const bool is_integer = false;
    static const bool is_exact = false;

    static const int radix = 10;
    static dfp::decimal128 epsilon() throw() { return DEC128_EPSILON; }
    static dfp::decimal128 round_error() throw() { return ...; }

    static const int min_exponent = -6143;
    static const int min_exponent10 = min_exponent;
    static const int max_exponent = 6144;
    static const int max_exponent10 = max_exponent;

    static const bool has_infinity = true;
    static const bool has_quiet_NaN = true;
    static const bool has_signaling_NaN = true;
    static const float_denorm_style has_denorm = denorm_present;
    static const bool has_denorm_loss = true;

    static dfp::decimal128 infinity() throw() { return ...; }
    static dfp::decimal128 quiet_NaN() throw() { return ...; }
    static dfp::decimal128 signaling_NaN() throw() { return ...; }
    static dfp::decimal128 denorm_min() throw() { return DEC128_DEN; }

    static const bool is_iec559 = false;
    static const bool is_bounded = true;
    static const bool is_modulo = false;

    static const bool traps = true;
    static const bool tinyness_before = true;

    static const float_round_style round_style = round_indeterminate;
  };

}
--end example]
<cdecfloat> and <decfloat.h>
The standard C++ headers <cfloat> and <float.h> define characteristics of the floating-point types float, double, and long double. Their contents remain unchanged by this Technical Report.
Headers <cdecfloat> and <decfloat.h> define characteristics of the decimal floating-point types decimal32, decimal64, and decimal128. In addition, <decfloat.h> defines the convenience typedefs _Decimal32, _Decimal64, and _Decimal128, for compatibility with the C programming language.
<cdecfloat> synopsis
#include <dec32>
#include <dec64>
#include <dec128>

// number of digits in the coefficient:
#define DEC32_MANT_DIG 7
#define DEC64_MANT_DIG 16
#define DEC128_MANT_DIG 34

// minimum exponent:
#define DEC32_MIN_EXP -95
#define DEC64_MIN_EXP -383
#define DEC128_MIN_EXP -6143

// maximum exponent:
#define DEC32_MAX_EXP 96
#define DEC64_MAX_EXP 384
#define DEC128_MAX_EXP 6144

// 3.5.3 maximum finite value:
#define DEC32_MAX implementation-defined
#define DEC64_MAX implementation-defined
#define DEC128_MAX implementation-defined

// 3.5.4 epsilon:
#define DEC32_EPSILON implementation-defined
#define DEC64_EPSILON implementation-defined
#define DEC128_EPSILON implementation-defined

// 3.5.5 minimum positive normal value:
#define DEC32_MIN implementation-defined
#define DEC64_MIN implementation-defined
#define DEC128_MIN implementation-defined

// 3.5.6 minimum positive subnormal value:
#define DEC32_DEN implementation-defined
#define DEC64_DEN implementation-defined
#define DEC128_DEN implementation-defined

// 3.5.7 evaluation format:
#define DEC_EVAL_METHOD implementation-defined
<decfloat.h> synopsis
#include <cdecfloat>

// C-compatibility convenience typedefs:
typedef std::dfp::decimal32 _Decimal32;
typedef std::dfp::decimal64 _Decimal64;
typedef std::dfp::decimal128 _Decimal128;
#define DEC32_MAX implementation-defined
Expansion: an lvalue of type decimal32 equal to the maximum finite number that can be represented by an object of type decimal32; exactly equal to 9.999999 × 10^96 (there are six 9's after the decimal point)
#define DEC64_MAX implementation-defined
Expansion: an lvalue of type decimal64 equal to the maximum finite number that can be represented by an object of type decimal64; exactly equal to 9.999999999999999 × 10^384 (there are fifteen 9's after the decimal point)
#define DEC128_MAX implementation-defined
Expansion: an lvalue of type decimal128 equal to the maximum finite number that can be represented by an object of type decimal128; exactly equal to 9.999999999999999999999999999999999 × 10^6144 (there are thirty-three 9's after the decimal point)
#define DEC32_EPSILON implementation-defined
Expansion: an lvalue of type
decimal32
equal to the difference between 1 and the least value greater than 1 that can be represented by an object of typedecimal32
; exactly equal to 1 x 10-6
#define DEC64_EPSILON implementation-defined
Expansion: an lvalue of type
decimal64
equal to the difference between 1 and the least value greater than 1 that can be represented by an object of typedecimal64
; exactly equal to 1 x 10-15
#define DEC128_EPSILON implementation-defined
Expansion: an lvalue of type
decimal128
equal to the difference between 1 and the least value greater than 1 that can be represented by an object of typedecimal128
; exactly equal to 1 x 10-33
#define DEC32_MIN implementation-defined
Expansion: an lvalue of type
decimal32
equal to the minimum positive normal number that can be represented by an object of typedecimal32
; exactly equal to 1 x 10-95
#define DEC64_MIN implementation-defined
Expansion: an lvalue of type
decimal64
equal to the minimum positive normal number that can be represented by an object of typedecimal64
; exactly equal to 1 x 10-383
#define DEC128_MIN implementation-defined
Expansion: an lvalue of type
decimal128
equal to the minimum positive normal number that can be represented by an object of typedecimal128
; exactly equal to 1 x 10-6143
#define DEC32_DEN implementation-defined
Expansion: an lvalue of type
decimal32
equal to the minimum positive finite number that can be represented by an object of typedecimal32
; exactly equal to 1 x 10-101
#define DEC64_DEN implementation-defined
Expansion: an lvalue of type
decimal64
equal to the minimum positive finite number that can be represented by an object of typedecimal64
; exactly equal to 1 x 10-398
#define DEC128_DEN implementation-defined
Expansion: an lvalue of type
decimal128
equal to the minimum positive finite number that can be represented by an object of typedecimal128
; exactly equal to 1 x 10-6176
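The exact values quoted above follow from the coefficient length and the exponent limits of each format. The sketch below is not part of the proposal; it only recomputes the decimal exponents of the epsilon, minimum normal, and minimum subnormal values from those three parameters. The `format_limits` struct and every helper name are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper type: the three parameters that characterize a format.
struct format_limits {
    int mant_dig;  // number of digits in the coefficient
    int min_exp;   // minimum exponent
    int max_exp;   // maximum exponent
};

// Values taken from the synopsis and the macro descriptions.
const format_limits dec32_limits  = { 7,  -95,   96 };
const format_limits dec64_limits  = { 16, -383,  384 };
const format_limits dec128_limits = { 34, -6143, 6144 };

// Epsilon is one unit in the last of the mant_dig digits of 1.000...0.
inline int epsilon_exponent(const format_limits& f) { return 1 - f.mant_dig; }

// The smallest positive normal value is 1 x 10^min_exp.
inline int min_normal_exponent(const format_limits& f) { return f.min_exp; }

// The smallest positive subnormal value keeps a single nonzero digit in the
// last coefficient position, mant_dig - 1 places below the minimum normal.
inline int min_subnormal_exponent(const format_limits& f) {
    return f.min_exp - (f.mant_dig - 1);
}

// The largest finite value is 9.99...9 x 10^max_exp, with mant_dig - 1 nines
// after the decimal point.
inline int nines_after_point(const format_limits& f) { return f.mant_dig - 1; }
```

For example, `min_subnormal_exponent(dec64_limits)` evaluates to -398, matching the description of DEC64_DEN.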
#define DEC_EVAL_METHOD implementation-defined
Except for assignment and casts, the values of operations with decimal floating operands and values subject to the usual arithmetic conversions are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of DEC_EVAL_METHOD:
-1 indeterminable;
0 evaluate all operations and constants just to the range and precision of the type;
1 evaluate operations and constants of type decimal32 and decimal64 to the range and precision of the decimal64 type, evaluate decimal128 operations and constants to the range and precision of the decimal128 type;
2 evaluate all operations and constants to the range and precision of the decimal128 type.
All other negative values for DEC_EVAL_METHOD characterize implementation-defined behavior.
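As an illustration only, the mapping from a DEC_EVAL_METHOD value to the evaluation format can be written as a small function. The name `evaluation_width` and the convention of returning 0 for "not determinable from the value alone" are assumptions of this sketch, not part of the proposal.

```cpp
#include <cassert>

// Returns the width in bits of the format in which an operation on operands
// of width operand_bits (32, 64 or 128) is evaluated, or 0 when the method
// value does not determine it.
inline int evaluation_width(int eval_method, int operand_bits) {
    switch (eval_method) {
    case 0:  return operand_bits;                    // evaluate in the type itself
    case 1:  return operand_bits == 128 ? 128 : 64;  // decimal32 widens to decimal64
    case 2:  return 128;                             // everything widens to decimal128
    default: return 0;  // -1 (indeterminable) or an implementation-defined value
    }
}
```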
<cfenv> and <fenv.h>
The header <cfenv> is described in [tr.c99.cfenv]. The header <fenv.h> is described in [tr.c99.fenv]. The floating-point environment specified in these clauses is extended by this Technical Report to apply to decimal floating-point types.
<cfenv> synopsis
// 3.6.2 rounding direction macros:
#define FE_DEC_DOWNWARD implementation-defined
#define FE_DEC_TONEAREST implementation-defined
#define FE_DEC_TONEARESTFROMZERO implementation-defined
#define FE_DEC_TOWARD_ZERO implementation-defined
#define FE_DEC_UPWARD implementation-defined

namespace std {
namespace dfp {
  // 3.6.3 fe_dec_getround function:
  int fe_dec_getround();

  // 3.6.4 fe_dec_setround function:
  int fe_dec_setround(int round);
}
}
Macros are added to <cfenv> and <fenv.h>:
Additional DFP rounding direction macros introduced by this Technical Report | Equivalent TR1 macro for generic floating types | IEEE-754 |
---|---|---|
FE_DEC_DOWNWARD | FE_DOWNWARD | Toward minus infinity |
FE_DEC_TONEAREST | FE_TONEAREST | To nearest, ties even |
FE_DEC_TONEARESTFROMZERO | n/a | To nearest, ties away from zero |
FE_DEC_TOWARD_ZERO | FE_TOWARDZERO | Toward zero |
FE_DEC_UPWARD | FE_UPWARD | Toward plus infinity |
These macros are used by the fe_dec_getround and fe_dec_setround functions for getting and setting the rounding mode to be used in decimal floating-point operations.
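To make the five rounding directions concrete, the following sketch drops the last decimal digit of an integer coefficient under each of them. It is an illustration, not an implementation of the proposal: the `dec_round` enumerators are hypothetical stand-ins for the FE_DEC_* macros (whose real values are implementation-defined), and `drop_last_digit` is an invented helper.

```cpp
#include <cassert>

// Hypothetical enumerators standing in for the FE_DEC_* macros.
enum dec_round {
    round_downward,
    round_tonearest,
    round_tonearestfromzero,
    round_toward_zero,
    round_upward
};

// Divides coeff by 10, rounding the discarded digit in the given direction.
inline long long drop_last_digit(long long coeff, dec_round mode) {
    const long long q = coeff / 10;  // truncates toward zero
    const long long r = coeff % 10;  // carries the sign of coeff
    if (r == 0) return q;            // exact: nothing to round
    const bool negative = coeff < 0;
    const long long mag = negative ? -r : r;          // discarded digit, 1..9
    const long long away = negative ? q - 1 : q + 1;  // neighbour farther from zero
    switch (mode) {
    case round_toward_zero:       return q;
    case round_downward:          return negative ? away : q;
    case round_upward:            return negative ? q : away;
    case round_tonearestfromzero: return mag >= 5 ? away : q;
    case round_tonearest:
        if (mag != 5) return mag > 5 ? away : q;
        return (q % 2 == 0) ? q : away;  // tie: pick the even neighbour
    }
    return q;
}
```

With a coefficient of 25, only the two "upward-leaning" choices (ties away from zero, toward plus infinity) produce 3; the other three directions produce 2.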
fe_dec_getround function
int fe_dec_getround();
Effects: gets the current rounding direction for decimal floating-point operations.
Returns: the value of the rounding direction macro representing the current rounding direction for decimal floating-point operations, or a negative value if there is no such rounding macro or the current rounding direction is not determinable.
fe_dec_setround function
int fe_dec_setround(int round);
Effects: establishes round as the rounding direction for decimal floating-point operations. If round is not equal to the value of a DFP rounding direction macro, the rounding direction is not changed.
Returns: a zero value if and only if the argument is equal to one of the rounding direction macros introduced in 3.6.2.
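The contract of the two functions (a setter that leaves the mode untouched and reports failure for an unknown argument) can be mimicked with a small mock. Everything here lives in a hypothetical `sketch` namespace; the enumerator values and the assumed default direction are inventions of this example, since the real macro values are implementation-defined.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-ins for the DFP rounding direction macros.
enum {
    dec_downward,
    dec_tonearest,
    dec_tonearestfromzero,
    dec_toward_zero,
    dec_upward
};

int current_round = dec_tonearest;  // assumed default for this mock

inline int fe_dec_getround() { return current_round; }

// Returns zero if and only if round names one of the five directions;
// otherwise the current direction is left unchanged.
inline int fe_dec_setround(int round) {
    if (round < dec_downward || round > dec_upward) return 1;
    current_round = round;
    return 0;
}

}  // namespace sketch
```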
<fenv.h>
Each name placed into the namespace dfp by <cfenv> is placed into both the namespace dfp and the global namespace by <fenv.h>.
<cmath> and <math.h>
The elementary mathematical functions declared in the standard C++ header <cmath> are overloaded by this Technical Report to support the decimal floating-point types. The macros HUGE_VAL_D32, HUGE_VAL_D64, HUGE_VAL_D128, DEC_INFINITY, and DEC_NAN are defined for use with these functions. With the exception of sqrt, fmax, and fmin, the accuracy of the result of a call to one of these functions is implementation-defined. The implementation may state that the accuracy is unknown. The TR1 function templates signbit, fpclassify, isfinite, isinf, isnan, isnormal, isgreater, isgreaterequal, isless, islessequal, islessgreater, and isunordered are also extended by this Technical Report to handle the decimal floating-point types.
<cmath> synopsis
// 3.7.2 macros: #define HUGE_VAL_D32 implementation-defined #define HUGE_VAL_D64 implementation-defined #define HUGE_VAL_D128 implementation-defined #define DEC_INFINITY implementation-defined #define DEC_NAN implementation-defined #define FP_FAST_FMAD32 implementation-defined #define FP_FAST_FMAD64 implementation-defined #define FP_FAST_FMAD128 implementation-defined namespace std { namespace dfp { // 3.7.3 evaluation formats: typedef decimal-floating-type decimal32_t; typedef decimal-floating-type decimal64_t; // 3.7.4 samequantum functions: bool samequantum (decimal32 x, decimal32 y); bool samequantumd32 (decimal32 x, decimal32 y); bool samequantum (decimal64 x, decimal64 y); bool samequantumd64 (decimal64 x, decimal64 y); bool samequantum (decimal128 x, decimal128 y); bool samequantumd128 (decimal128 x, decimal128 y); // 3.7.5 quantize functions: decimal32 quantize (decimal32 x, decimal32 y); decimal32 quantized32 (decimal32 x, decimal32 y); decimal64 quantize (decimal64 x, decimal64 y); decimal64 quantized64 (decimal64 x, decimal64 y); decimal128 quantize (decimal128 x, decimal128 y); decimal128 quantized128 (decimal128 x, decimal128 y); // 3.7.6 elementary functions: // trigonometric functions: decimal32 acosd32 (decimal32 x); decimal64 acosd64 (decimal64 x); decimal128 acosd128 (decimal128 x); decimal32 asind32 (decimal32 x); decimal64 asind64 (decimal64 x); decimal128 asind128 (decimal128 x); decimal32 atand32 (decimal32 x); decimal64 atand64 (decimal64 x); decimal128 atand128 (decimal128 x); decimal32 atan2d32 (decimal32 x, decimal32 y); decimal64 atan2d64 (decimal64 x, decimal64 y); decimal128 atan2d128 (decimal128 x, decimal128 y); decimal32 cosd32 (decimal32 x); decimal64 cosd64 (decimal64 x); decimal128 cosd128 (decimal128 x); decimal32 sind32 (decimal32 x); decimal64 sind64 (decimal64 x); decimal128 sind128 (decimal128 x); decimal32 tand32 (decimal32 x); decimal64 tand64 (decimal64 x); decimal128 tand128 (decimal128 x); // hyperbolic functions: 
decimal32 acoshd32 (decimal32 x); decimal64 acoshd64 (decimal64 x); decimal128 acoshd128 (decimal128 x); decimal32 asinhd32 (decimal32 x); decimal64 asinhd64 (decimal64 x); decimal128 asinhd128 (decimal128 x); decimal32 atanhd32 (decimal32 x); decimal64 atanhd64 (decimal64 x); decimal128 atanhd128 (decimal128 x); decimal32 coshd32 (decimal32 x); decimal64 coshd64 (decimal64 x); decimal128 coshd128 (decimal128 x); decimal32 sinhd32 (decimal32 x); decimal64 sinhd64 (decimal64 x); decimal128 sinhd128 (decimal128 x); decimal32 tanhd32 (decimal32 x); decimal64 tanhd64 (decimal64 x); decimal128 tanhd128 (decimal128 x); // exponential and logarithmic functions: decimal32 expd32 (decimal32 x); decimal64 expd64 (decimal64 x); decimal128 expd128 (decimal128 x); decimal32 exp2d32 (decimal32 x); decimal64 exp2d64 (decimal64 x); decimal128 exp2d128 (decimal128 x); decimal32 expm1d32 (decimal32 x); decimal64 expm1d64 (decimal64 x); decimal128 expm1d128 (decimal128 x); decimal32 frexpd32 (decimal32 value, int * exp); decimal64 frexpd64 (decimal64 value, int * exp); decimal128 frexpd128 (decimal128 value, int * exp); int ilogbd32 (decimal32 x); int ilogbd64 (decimal64 x); int ilogbd128 (decimal128 x); decimal32 ldexpd32 (decimal32 x, int exp); decimal64 ldexpd64 (decimal64 x, int exp); decimal128 ldexpd128 (decimal128 x, int exp); decimal32 logd32 (decimal32 x); decimal64 logd64 (decimal64 x); decimal128 logd128 (decimal128 x); decimal32 log10d32 (decimal32 x); decimal64 log10d64 (decimal64 x); decimal128 log10d128 (decimal128 x); decimal32 log1pd32 (decimal32 x); decimal64 log1pd64 (decimal64 x); decimal128 log1pd128 (decimal128 x); decimal32 log2d32 (decimal32 x); decimal64 log2d64 (decimal64 x); decimal128 log2d128 (decimal128 x); decimal32 logbd32 (decimal32 x); decimal64 logbd64 (decimal64 x); decimal128 logbd128 (decimal128 x); decimal32 modfd32 (decimal32 value, decimal32 * iptr); decimal64 modfd64 (decimal64 value, decimal64 * iptr); decimal128 modfd128 (decimal128 value, 
decimal128 * iptr); decimal32 scalbnd32 (decimal32 x, int n); decimal64 scalbnd64 (decimal64 x, int n); decimal128 scalbnd128 (decimal128 x, int n); decimal32 scalblnd32 (decimal32 x, long int n); decimal64 scalblnd64 (decimal64 x, long int n); decimal128 scalblnd128 (decimal128 x, long int n); // power and absolute-value functions: decimal32 cbrtd32 (decimal32 x); decimal64 cbrtd64 (decimal64 x); decimal128 cbrtd128 (decimal128 x); decimal32 fabsd32 (decimal32 x); decimal64 fabsd64 (decimal64 x); decimal128 fabsd128 (decimal128 x); decimal32 hypotd32 (decimal32 x, decimal32 y); decimal64 hypotd64 (decimal64 x, decimal64 y); decimal128 hypotd128 (decimal128 x, decimal128 y); decimal32 powd32 (decimal32 x, decimal32 y); decimal64 powd64 (decimal64 x, decimal64 y); decimal128 powd128 (decimal128 x, decimal128 y); decimal32 sqrtd32 (decimal32 x); decimal64 sqrtd64 (decimal64 x); decimal128 sqrtd128 (decimal128 x); // error and gamma functions: decimal32 erfd32 (decimal32 x); decimal64 erfd64 (decimal64 x); decimal128 erfd128 (decimal128 x); decimal32 erfcd32 (decimal32 x); decimal64 erfcd64 (decimal64 x); decimal128 erfcd128 (decimal128 x); decimal32 lgammad32 (decimal32 x); decimal64 lgammad64 (decimal64 x); decimal128 lgammad128 (decimal128 x); decimal32 tgammad32 (decimal32 x); decimal64 tgammad64 (decimal64 x); decimal128 tgammad128 (decimal128 x); // nearest integer functions: decimal32 ceild32 (decimal32 x); decimal64 ceild64 (decimal64 x); decimal128 ceild128 (decimal128 x); decimal32 floord32 (decimal32 x); decimal64 floord64 (decimal64 x); decimal128 floord128 (decimal128 x); decimal32 nearbyintd32 (decimal32 x); decimal64 nearbyintd64 (decimal64 x); decimal128 nearbyintd128 (decimal128 x); decimal32 rintd32 (decimal32 x); decimal64 rintd64 (decimal64 x); decimal128 rintd128 (decimal128 x); long int lrintd32 (decimal32 x); long int lrintd64 (decimal64 x); long int lrintd128 (decimal128 x); long long int llrintd32 (decimal32 x); long long int llrintd64 
(decimal64 x); long long int llrintd128 (decimal128 x); decimal32 roundd32 (decimal32 x); decimal64 roundd64 (decimal64 x); decimal128 roundd128 (decimal128 x); long int lroundd32 (decimal32 x); long int lroundd64 (decimal64 x); long int lroundd128 (decimal128 x); long long int llroundd32 (decimal32 x); long long int llroundd64 (decimal64 x); long long int llroundd128 (decimal128 x); decimal32 truncd32 (decimal32 x); decimal64 truncd64 (decimal64 x); decimal128 truncd128 (decimal128 x); // remainder functions: decimal32 fmodd32 (decimal32 x, decimal32 y); decimal64 fmodd64 (decimal64 x, decimal64 y); decimal128 fmodd128 (decimal128 x, decimal128 y); decimal32 remainderd32 (decimal32 x, decimal32 y); decimal64 remainderd64 (decimal64 x, decimal64 y); decimal128 remainderd128 (decimal128 x, decimal128 y); decimal32 remquod32 (decimal32 x, decimal32 y, int * quo); decimal64 remquod64 (decimal64 x, decimal64 y, int * quo); decimal128 remquod128 (decimal128 x, decimal128 y, int * quo); // manipulation functions: decimal32 copysignd32 (decimal32 x, decimal32 y); decimal64 copysignd64 (decimal64 x, decimal64 y); decimal128 copysignd128 (decimal128 x, decimal128 y); decimal32 nand32 (const char * tagp); decimal64 nand64 (const char * tagp); decimal128 nand128 (const char * tagp); decimal32 nextafterd32 (decimal32 x, decimal32 y); decimal64 nextafterd64 (decimal64 x, decimal64 y); decimal128 nextafterd128 (decimal128 x, decimal128 y); decimal32 nexttowardd32 (decimal32 x, decimal32 y); decimal64 nexttowardd64 (decimal64 x, decimal64 y); decimal128 nexttowardd128 (decimal128 x, decimal128 y); // maximum, minimum, and positive difference functions: decimal32 fdimd32 (decimal32 x, decimal32 y); decimal64 fdimd64 (decimal64 x, decimal64 y); decimal128 fdimd128 (decimal128 x, decimal128 y); decimal32 fmaxd32 (decimal32 x, decimal32 y); decimal64 fmaxd64 (decimal64 x, decimal64 y); decimal128 fmaxd128 (decimal128 x, decimal128 y); decimal32 fmind32 (decimal32 x, decimal32 y); 
decimal64 fmind64 (decimal64 x, decimal64 y); decimal128 fmind128 (decimal128 x, decimal128 y); // floating multiply-add: decimal32 fmad32 (decimal32 x, decimal32 y, decimal32 z); decimal64 fmad64 (decimal64 x, decimal64 y, decimal64 z); decimal128 fmad128 (decimal128 x, decimal128 y, decimal128 z); // 3.7.6.1 abs function overloads decimal32 abs(decimal32 d); decimal64 abs(decimal64 d); decimal128 abs(decimal128 d); } }
<cmath> macros
#define HUGE_VAL_D32 implementation-defined
Expansion: a positive lvalue of type decimal32.

#define HUGE_VAL_D64 implementation-defined
Expansion: a positive lvalue of type decimal64, not necessarily representable as a decimal32.

#define HUGE_VAL_D128 implementation-defined
Expansion: a positive lvalue of type decimal128, not necessarily representable as a decimal64.

#define DEC_INFINITY implementation-defined
Expansion: an lvalue of type decimal32 representing infinity.

#define DEC_NAN implementation-defined
Expansion: an lvalue of type decimal32 representing quiet NaN.

#define FP_FAST_FMAD32 implementation-defined
#define FP_FAST_FMAD64 implementation-defined
#define FP_FAST_FMAD128 implementation-defined
Effects: these macros are, respectively, decimal32, decimal64, and decimal128 analogs of FP_FAST_FMA in C99, subclause 7.12.
typedef decimal-floating-type decimal32_t;
typedef decimal-floating-type decimal64_t;
The types decimal32_t and decimal64_t are decimal floating types at least as wide as decimal32 and decimal64, respectively, and such that decimal64_t is at least as wide as decimal32_t. If DEC_EVAL_METHOD equals 0, decimal32_t and decimal64_t are decimal32 and decimal64, respectively; if DEC_EVAL_METHOD equals 1, they are both decimal64; if DEC_EVAL_METHOD equals 2, they are both decimal128; and for other values of DEC_EVAL_METHOD, they are otherwise implementation-defined.
samequantum functions
bool samequantumd32 (decimal32 x, decimal32 y);
bool samequantumd64 (decimal64 x, decimal64 y);
bool samequantumd128 (decimal128 x, decimal128 y);
Effects: determines if the representation exponents of x and y are the same. If both x and y are NaN, or both are infinity, they have the same representation exponents; if exactly one operand is infinity or exactly one operand is NaN, they do not have the same representation exponents. The samequantum functions raise no exceptions.
Returns: true when x and y have the same representation exponents, false otherwise.

bool samequantum (decimal32 x, decimal32 y);
Returns: samequantumd32(x, y)

bool samequantum (decimal64 x, decimal64 y);
Returns: samequantumd64(x, y)

bool samequantum (decimal128 x, decimal128 y);
Returns: samequantumd128(x, y)
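The rules above can be read as a short decision procedure. The following sketch models a decimal value with a hypothetical `toy_decimal` struct (a kind tag, an integer coefficient, and an exponent); neither the struct nor `toy_samequantum` is part of the proposal, and the sign is ignored because it plays no role in the comparison.

```cpp
#include <cassert>

// Hypothetical stand-in for a decimal value: coefficient x 10^exponent,
// or one of the two special kinds.
enum dec_kind { dec_finite, dec_infinite, dec_nan };

struct toy_decimal {
    dec_kind kind;
    long long coefficient;
    int exponent;
};

inline bool toy_samequantum(const toy_decimal& x, const toy_decimal& y) {
    // If either operand is special, the answer is true only when both are
    // NaN or both are infinity.
    if (x.kind != dec_finite || y.kind != dec_finite)
        return x.kind == y.kind;
    // Two finite operands: compare representation exponents only.
    return x.exponent == y.exponent;
}
```

Note that 1.0 (coefficient 10, exponent -1) and 1.00 (coefficient 100, exponent -2) are numerically equal but do not have the same quantum, whereas 1.0 and 2.5 do.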
quantize functions
decimal32 quantized32 (decimal32 x, decimal32 y);
decimal64 quantized64 (decimal64 x, decimal64 y);
decimal128 quantized128 (decimal128 x, decimal128 y);
Effects: sets the exponent of argument x to the exponent of argument y. If the exponent is being increased, the value is correctly rounded according to the current rounding mode; if the result does not have the same value as x, the "inexact" floating-point exception is raised. If the exponent is being decreased and the significand of the result has more digits than the type would allow, the "invalid" floating-point exception is raised and the result is NaN. If one or both operands are NaN the result is NaN. Otherwise if only one operand is infinity, the "invalid" floating-point exception is raised and the result is NaN. If both operands are infinity, the result is DEC_INFINITY. The quantize functions do not signal underflow. Whether the quantize functions signal overflow is implementation-defined.
Returns: the number which is equal in value (except for any rounding) and sign to x, and which has an exponent set to be equal to the exponent of y.

decimal32 quantize (decimal32 x, decimal32 y);
Returns: quantized32(x, y)

decimal64 quantize (decimal64 x, decimal64 y);
Returns: quantized64(x, y)

decimal128 quantize (decimal128 x, decimal128 y);
Returns: quantized128(x, y)
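A minimal sketch of the finite-operand case may help in following the Effects paragraph. It makes several assumptions that are not in the proposal: values are modelled by a hypothetical `toy_finite` struct, the rounding mode is fixed at round-half-even rather than taken from the environment, the coefficient length `max_digits` is at most 18 so that it fits in a `long long`, and the "invalid" and "inexact" exceptions are reported through a return value and an out-parameter.

```cpp
#include <cassert>

// Hypothetical stand-in: value = coefficient x 10^exponent.
struct toy_finite {
    long long coefficient;
    int exponent;
};

// Re-expresses x with exponent target_exponent. Returns false for the
// "invalid" case (result would need more than max_digits digits).
inline bool toy_quantize(toy_finite x, int target_exponent, int max_digits,
                         toy_finite& result, bool& inexact) {
    long long c = x.coefficient;
    inexact = false;
    if (target_exponent >= x.exponent) {
        // Exponent increases: digits are discarded, rounding half to even.
        const int shift = target_exponent - x.exponent;
        if (shift > 18) {            // every digit is discarded
            inexact = (c != 0);
            c = 0;
        } else {
            long long divisor = 1;
            for (int i = 0; i < shift; ++i) divisor *= 10;
            long long q = c / divisor;
            const long long r = c % divisor;
            if (r != 0) {
                inexact = true;
                const long long twice = 2 * (r < 0 ? -r : r);
                if (twice > divisor || (twice == divisor && q % 2 != 0))
                    q += (c < 0) ? -1 : 1;   // round away from zero
            }
            c = q;
        }
    } else {
        // Exponent decreases: zeros are appended, which must still fit.
        long long limit = 1;
        for (int i = 0; i < max_digits; ++i) limit *= 10;
        for (int e = x.exponent; e > target_exponent && c != 0; --e) {
            const long long mag = c < 0 ? -c : c;
            if (mag >= limit / 10) return false;  // one more digit will not fit
            c *= 10;
        }
    }
    result.coefficient = c;
    result.exponent = target_exponent;
    return true;
}
```

For instance, quantizing 1.25 (125 x 10^-2) to one fractional digit gives 12 x 10^-1 under this rounding assumption and sets `inexact`, while quantizing 1 to two fractional digits gives 100 x 10^-2 exactly.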
For each of the following standard elementary functions from <cmath>,
acos ceil floor log sin tanh asin cos fmod log10 sinh atan cosh frexp modf sqrt atan2 fabs ldexp pow tan
and for each of the following TR1 elementary functions from <cmath>:
acosh expm1 llround nexttoward asinh fdim lrint remainder atanh fma lround remquo cbrt fmax log1p rint copysign fmin log2 round erf hypot logb scalbn erfc ilogb nan scalbln exp lgamma nearbyint tgamma exp2 llrint nextafter trunc
this Technical Report introduces:
- a function declared in namespace std::dfp with the name funcd32, where func is the name of the original function; all parameters of type double in the original are replaced with type decimal32 in the new function; all parameters of type double * are replaced with type decimal32 *; if the return type of the original function is double, the return type of the new function is decimal32; the specification of the behavior of the new function is otherwise equivalent to that of the original function
- … funcd32, described above
- a function declared in namespace std::dfp with the name funcd64, where func is the name of the original function; all parameters of type double in the original are replaced with type decimal64 in the new function; all parameters of type double * are replaced with type decimal64 *; if the return type of the original function is double, the return type of the new function is decimal64; the specification of the behavior of the new function is otherwise equivalent to that of the original function
- … funcd64, described above
- a function declared in namespace std::dfp with the name funcd128, where func is the name of the original function; all parameters of type double in the original are replaced with type decimal128 in the new function; all parameters of type double * are replaced with type decimal128 *; if the return type of the original function is double, the return type of the new function is decimal128; the specification of the behavior of the new function is otherwise equivalent to that of the original function
- … funcd128, described above
Moreover, there shall be additional overloads of the original function func, declared in func's namespace, sufficient to ensure:
- If any argument corresponding to a decimal64 parameter has type decimal128, then all arguments of decimal floating-point type or integer type corresponding to decimal64 parameters are effectively cast to decimal128.
- Otherwise, if any argument corresponding to a decimal64 parameter has type decimal64, then all other arguments of decimal floating-point type or integer type corresponding to decimal64 parameters are effectively cast to decimal64.
- Otherwise, if any argument corresponding to a decimal64 parameter has type decimal32, then all other arguments of decimal floating-point type or integer type corresponding to decimal64 parameters are effectively cast to decimal32.
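The three promotion rules amount to "the widest decimal type among the arguments wins, and integer arguments follow it". The sketch below expresses that reading at the type level. It is an assumption-laden illustration: `d32`, `d64`, and `d128` are hypothetical tag types standing in for the decimal types, `rank` and `common_decimal` are invented names, and `std::conditional` from C++11 is used for brevity in place of TR1-era machinery.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag types standing in for decimal32, decimal64, decimal128.
struct d32 {};
struct d64 {};
struct d128 {};

// Ordering used to pick the wider of two argument types.
template <class T> struct rank;
template <> struct rank<int>  { static const int value = 0; };  // integers never win
template <> struct rank<d32>  { static const int value = 1; };
template <> struct rank<d64>  { static const int value = 2; };
template <> struct rank<d128> { static const int value = 3; };

// Type to which both arguments of a two-parameter function are cast.
// A call with two integer arguments involves no decimal type and is not
// modelled by this sketch.
template <class A, class B>
struct common_decimal {
    typedef typename std::conditional<
        (rank<A>::value >= rank<B>::value), A, B>::type type;
};
```

Under this reading, a call such as pow(decimal32, decimal128) would evaluate as if both arguments were decimal128, and pow(int, decimal32) as if both were decimal32.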
[Editor's note: The combination of TR1 8.16.4/4 and the above dictates that the first argument to the following function call should be converted to type double: pow(decimal64(), double()). However, there is no implicit conversion from decimal64 to double, so the function call will be ill-formed. I view this as a good thing.]
abs function overloads
decimal32 abs(decimal32 d);
decimal64 abs(decimal64 d);
decimal128 abs(decimal128 d);
Returns: fabs(d)
<math.h>
The header behaves as if it includes the header <cmath>, and provides sufficient additional using declarations to declare in the global namespace all the additional function and type names introduced by this Technical Report to the header <cmath>.
<math.h> synopsis
// C-compatibility convenience macros:
#define _Decimal32_t std::dfp::decimal32_t
#define _Decimal64_t std::dfp::decimal64_t
<cstdio> and <stdio.h>
This Technical Report introduces the following formatted input/output specifiers for fprintf, fscanf, and related functions declared in <cstdio> and <stdio.h>:
H Specifies that any following e, E, f, F, g, or G conversion specifier applies to a decimal32 argument.
D Specifies that any following e, E, f, F, g, or G conversion specifier applies to a decimal64 argument.
DD Specifies that any following e, E, f, F, g, or G conversion specifier applies to a decimal128 argument.
<cstdlib> and <stdlib.h>
<cstdlib> synopsis
namespace std {
namespace dfp {
  // 3.9.2 strtod functions:
  decimal32 strtod32 (const char * nptr, char ** endptr);
  decimal64 strtod64 (const char * nptr, char ** endptr);
  decimal128 strtod128 (const char * nptr, char ** endptr);
}
}
strtod functions
These functions behave as specified in subclause 9.4 of ISO/IEC TR 24732.
<stdlib.h>
Each name placed into the namespace dfp by <cstdlib> is placed into both the namespace dfp and the global namespace by <stdlib.h>.
<cwchar> and <wchar.h>
<cwchar> synopsis
namespace std {
namespace dfp {
  // 3.10.2 wcstod functions:
  decimal32 wcstod32 (const wchar_t * nptr, wchar_t ** endptr);
  decimal64 wcstod64 (const wchar_t * nptr, wchar_t ** endptr);
  decimal128 wcstod128 (const wchar_t * nptr, wchar_t ** endptr);
}
}
wcstod functions
These functions behave as specified in subclause 9.5 of ISO/IEC TR 24732.
<wchar.h>
Each name placed into the namespace dfp by <cwchar> is placed into both the namespace dfp and the global namespace by <wchar.h>.
This Technical Report introduces the locale facet templates extended_num_get and extended_num_put. For any locale loc either constructed, or returned by locale::classic(), and any facet Facet that is one of the required instantiations indicated in Table 3, std::has_facet<Facet>(loc) is true. Each std::locale member function that has a parameter cat of type std::locale::category operates on these facets when (cat & std::locale::numeric) != 0.
Category | Facets |
---|---|
numeric | extended_num_get<char>, extended_num_get<wchar_t>, extended_num_put<char>, extended_num_put<wchar_t> |
<locale> synopsis
namespace std {
namespace dfp {
  // 3.11.2 extended_num_get facet:
  template <class charT, class InputIterator> class extended_num_get;

  // 3.11.3 extended_num_put facet:
  template <class charT, class OutputIterator> class extended_num_put;
}
}
extended_num_get
namespace std {
namespace dfp {
  template <class charT,
            class InputIterator = std::istreambuf_iterator<charT, std::char_traits<charT> > >
  class extended_num_get : public std::locale::facet {
  public:
    typedef charT char_type;
    typedef InputIterator iter_type;

    explicit extended_num_get(size_t refs = 0);
    extended_num_get(const std::num_get<charT, InputIterator> & b, size_t refs = 0);

    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal32 & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal64 & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal128 & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, bool & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, long & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, unsigned short & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, unsigned int & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, unsigned long & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, float & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, double & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, long double & val) const;
    iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, void * & val) const;

    static std::locale::id id;

  protected:
    ~extended_num_get(); // virtual
    virtual iter_type do_get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal32 & val) const;
    virtual iter_type do_get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal64 & val) const;
    virtual iter_type do_get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal128 & val) const;

    // const std::num_get<charT, InputIterator> & base; exposition only
  };
}
}
extended_num_get members
explicit extended_num_get(size_t refs = 0);
Effects: Constructs an extended_num_get facet as if by:
typedef std::num_get<charT, InputIterator> base_type;
explicit extended_num_get(size_t refs = 0)
  : facet(refs), base(std::use_facet<base_type>(std::locale()))
  { /* ... */ }
Notes: Care must be taken to ensure that the lifetime of the facet referenced by base exceeds that of the resulting extended_num_get facet.

extended_num_get(const std::num_get<charT, InputIterator> & b, size_t refs = 0);
Effects: Constructs an extended_num_get facet as if by:
extended_num_get(const std::num_get<charT, InputIterator> & b, size_t refs = 0)
  : facet(refs), base(b)
  { /* ... */ }
Notes: Care must be taken to ensure that the lifetime of the facet referenced by base exceeds that of the resulting extended_num_get facet.

iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal32 & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal64 & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal128 & val) const;
Returns: do_get(in, end, str, err, val).

iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, bool & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, long & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, unsigned short & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, unsigned int & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, unsigned long & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, float & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, double & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, long double & val) const;
iter_type get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, void * & val) const;
Returns: base.get(in, end, str, err, val).

extended_num_get virtual functions
iter_type do_get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal32 & val) const;
iter_type do_get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal64 & val) const;
iter_type do_get(iter_type in, iter_type end, std::ios_base & str, std::ios_base::iostate & err, decimal128 & val) const;
Effects: The input characters will be interpreted as described in [lib.facet.num.get.virtuals], and the resulting value will be stored in val. For conversions to type decimal32, decimal64, and decimal128, the conversion specifiers are %gHD, %gD, and %gLD, respectively.
Returns: in.
extended_num_put
namespace std {
namespace dfp {
  template <class charT,
            class OutputIterator = std::ostreambuf_iterator<charT, std::char_traits<charT> > >
  class extended_num_put : public std::locale::facet {
  public:
    typedef charT char_type;
    typedef OutputIterator iter_type;

    explicit extended_num_put(size_t refs = 0);
    extended_num_put(const std::num_put<charT, OutputIterator> & b, size_t refs = 0);

    iter_type put(iter_type s, ios_base & f, char_type fill, const decimal32 & val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, const decimal64 & val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, const decimal128 & val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, bool val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, long val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, unsigned long val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, double val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, long double val) const;
    iter_type put(iter_type s, ios_base & f, char_type fill, const void * val) const;

    static std::locale::id id;

  protected:
    ~extended_num_put(); // virtual
    virtual iter_type do_put(iter_type s, ios_base & f, char_type fill, const decimal32 & val) const;
    virtual iter_type do_put(iter_type s, ios_base & f, char_type fill, const decimal64 & val) const;
    virtual iter_type do_put(iter_type s, ios_base & f, char_type fill, const decimal128 & val) const;

    // const std::num_put<charT, OutputIterator> & base; exposition only
  };
}
}
extended_num_put
members
explicit extended_num_put(size_t refs = 0);
Effects: Constructs an
extended_num_put
facet as if by:typedef std::num_put<charT, OutputIterator> base_type; explicit extended_num_put(size_t refs = 0) : facet(refs), base(std::use_facet<base_type>(std::locale()) { /* ... */ }Notes: Care must be taken to ensure that the lifetime of the facet referenced by base exceeds that of the resulting
extended_num_put
facet.
extended_num_put(std::num_put<charT, InputIterator> & b, size_t refs = 0);
Effects: Constructs an
extended_num_put
facet as if by:extended_num_put(const std::num_put<charT, InputIterator> & b, size_t refs = 0) : facet(refs), base(b) { /* ... */ }Notes: Care must be taken to ensure that the lifetime of the facet referenced by base exceeds that of the resulting
extended_num_put
facet.
iter_type put(iter_type s, ios_base & f, char_type fill, const decimal32 & val) const; iter_type put(iter_type s, ios_base & f, char_type fill, const decimal64 & val) const; iter_type put(iter_type s, ios_base & f, char_type fill, const decimal128 & val) const;
Returns:
do_put(s, f, fill, val)
.
iter_type put(iter_type s, ios_base & f, char_type fill, bool val) const; iter_type put(iter_type s, ios_base & f, char_type fill, long val) const; iter_type put(iter_type s, ios_base & f, char_type fill, unsigned long val) const; iter_type put(iter_type s, ios_base & f, char_type fill, double val) const; iter_type put(iter_type s, ios_base & f, char_type fill, long double val) const; iter_type put(iter_type s, ios_base & f, char_type fill, const void * val) const;
Returns: base.put(s, f, fill, val).
extended_num_put virtual functions
virtual iter_type do_put(iter_type s, ios_base & f, char_type fill, const decimal32 & val) const; virtual iter_type do_put(iter_type s, ios_base & f, char_type fill, const decimal64 & val) const; virtual iter_type do_put(iter_type s, ios_base & f, char_type fill, const decimal128 & val) const;
Effects: The number represented by val will be formatted for output as described in [lib.facet.num.put.virtuals]. A length modifier is added to the conversion specifier as indicated in Table 4.
Table 4 -- Length modifier
Type | Length modifier |
---|---|
decimal32 | HD |
decimal64 | D |
decimal128 | LD |
Returns: out.
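The delegation pattern described above can be sketched with standard facilities only. The following is an illustration, not the proposed implementation: toy_decimal, toy_extended_num_put, and format_with_facet are hypothetical names, toy_decimal is a stand-in for a decimal type (a coefficient scaled by a power of ten), and the formatting in do_put is deliberately simplistic and ignores the stream's flags, width, and fill.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical stand-in for a decimal type: value = coefficient * 10^(-scale).
struct toy_decimal {
    long coefficient;
    int scale;  // number of digits after the decimal point, assumed >= 0
};

template <class charT, class OutputIterator = std::ostreambuf_iterator<charT> >
class toy_extended_num_put : public std::locale::facet {
public:
    typedef charT char_type;
    typedef OutputIterator iter_type;
    typedef std::num_put<charT, OutputIterator> base_type;

    // Wraps the num_put facet of the current global locale.
    explicit toy_extended_num_put(std::size_t refs = 0)
        : std::locale::facet(refs),
          base(std::use_facet<base_type>(std::locale())) {}

    // Wraps a caller-supplied facet; the caller keeps it alive.
    toy_extended_num_put(const base_type& b, std::size_t refs = 0)
        : std::locale::facet(refs), base(b) {}

    // New type: dispatch to the virtual do_put.
    iter_type put(iter_type s, std::ios_base& f, char_type fill,
                  const toy_decimal& val) const {
        return do_put(s, f, fill, val);
    }

    // Built-in type: forward to the wrapped facet.
    iter_type put(iter_type s, std::ios_base& f, char_type fill, long val) const {
        return base.put(s, f, fill, val);
    }

    static std::locale::id id;

protected:
    ~toy_extended_num_put() {}

    // Simplistic fixed-point formatting; flags, width, and fill are ignored.
    virtual iter_type do_put(iter_type s, std::ios_base& f, char_type,
                             const toy_decimal& val) const {
        const std::ctype<charT>& ct = std::use_facet<std::ctype<charT> >(f.getloc());
        unsigned long mag = val.coefficient < 0
            ? 0UL - static_cast<unsigned long>(val.coefficient)
            : static_cast<unsigned long>(val.coefficient);
        std::string digits = std::to_string(mag);
        // Pad with zeros so that at least one digit precedes the point.
        if (digits.size() <= static_cast<std::size_t>(val.scale))
            digits.insert(0, val.scale + 1 - digits.size(), '0');
        if (val.scale > 0)
            digits.insert(digits.size() - val.scale, 1, '.');
        if (val.coefficient < 0)
            digits.insert(0, 1, '-');
        for (char c : digits) *s++ = ct.widen(c);
        return s;
    }

private:
    const base_type& base;
};

template <class charT, class OutputIterator>
std::locale::id toy_extended_num_put<charT, OutputIterator>::id;

// Installs the facet in a stream's locale and formats one value of each kind.
std::string format_with_facet(const toy_decimal& d, long n) {
    std::ostringstream os;
    os.imbue(std::locale(os.getloc(), new toy_extended_num_put<char>));
    const toy_extended_num_put<char>& np =
        std::use_facet<toy_extended_num_put<char> >(os.getloc());
    std::ostreambuf_iterator<char> out(os);
    out = np.put(out, os, os.fill(), d);
    *out++ = ' ';
    np.put(out, os, os.fill(), n);
    return os.str();
}
```

In this sketch the default constructor binds base to a facet owned by the global locale, which is one way to meet the lifetime requirement stated in the Notes above.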
The effect of the following type traits, when applied to any of the decimal floating-point types, is implementation-defined:
std::tr1::is_arithmetic
std::tr1::is_fundamental
std::tr1::is_scalar
std::tr1::is_class
However, the following expression shall yield true, where dec is one of decimal32, decimal64, or decimal128:
is_arithmetic<dec>::value == is_fundamental<dec>::value == is_scalar<dec>::value == !is_class<dec>::value
[Note: The behavior of the type trait std::tr1::is_floating_point is not altered by this Technical Report. --end note]
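Read literally as a C++ expression, the chained == above groups left to right, so it can yield true even when the four values do not all agree. The following sketch contrasts the expression as written with a pairwise check, on the assumption that the intent is for all four values to be equal. It uses the C++11 <type_traits> header in place of std::tr1, and class_like_decimal, chained_as_written, and traits_agree are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an implementation that provides a decimal
// type as a class rather than as a fundamental type.
struct class_like_decimal { unsigned char bytes[8]; };

// The expression exactly as written: == groups left to right.
template <class T>
struct chained_as_written {
    static const bool value =
        (((std::is_arithmetic<T>::value == std::is_fundamental<T>::value)
            == std::is_scalar<T>::value)
            == !std::is_class<T>::value);
};

// Assumed intent: all four values are equal, checked pairwise.
template <class T>
struct traits_agree {
    static const bool value =
        (std::is_arithmetic<T>::value == std::is_fundamental<T>::value) &&
        (std::is_fundamental<T>::value == std::is_scalar<T>::value) &&
        (std::is_scalar<T>::value == !std::is_class<T>::value);
};
```

For int*, is_arithmetic and is_fundamental are false while is_scalar is true, so the pairwise check fails but the chained form still evaluates to true.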
<type_traits> synopsis
namespace std {
namespace dfp {
  // 3.12.2 is_decimal_floating_point type_trait:
  template <class T> struct is_decimal_floating_point;
}
}
is_decimal_floating_point type_trait
[Editor's note: an earlier draft of this document used the name is_decimal_fp for this type_trait. Though it's longer, the current name was adopted for consistency with the TR1 is_floating_point type_trait.]
is_decimal_floating_point is a UnaryTypeTrait [tr.meta.rqmts] and satisfies all of the requirements of that category [tr.meta.requirements].
Template | Condition | Comments |
---|---|---|
template <class T> struct is_decimal_floating_point; | T is one of decimal32, decimal64, or decimal128 | |
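One way such a trait could be written is by explicit specialization over the three types. This is a minimal sketch under stated assumptions: the toy_dfp namespace and its empty-shell decimal32, decimal64, and decimal128 structs are hypothetical stand-ins, C++11 std::true_type and std::false_type are used in place of their TR1 counterparts, and accepting cv-qualified types is an assumption made by analogy with is_floating_point rather than something the table above states.

```cpp
#include <cassert>
#include <type_traits>

namespace toy_dfp {

// Hypothetical stand-ins for the decimal floating-point types.
struct decimal32  { unsigned char bytes[4]; };
struct decimal64  { unsigned char bytes[8]; };
struct decimal128 { unsigned char bytes[16]; };

namespace detail {
// Primary template: false for every type not listed below.
template <class T> struct is_decimal_fp_impl : std::false_type {};
template <> struct is_decimal_fp_impl<decimal32>  : std::true_type {};
template <> struct is_decimal_fp_impl<decimal64>  : std::true_type {};
template <> struct is_decimal_fp_impl<decimal128> : std::true_type {};
}  // namespace detail

// Assumption: cv-qualifiers are stripped before the lookup.
template <class T>
struct is_decimal_floating_point
    : detail::is_decimal_fp_impl<typename std::remove_cv<T>::type> {};

}  // namespace toy_dfp
```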
<functional> synopsis
namespace std {
namespace tr1 {
  // 3.13.2 Hash function specializations:
  template <> struct hash<dfp::decimal32>;
  template <> struct hash<dfp::decimal64>;
  template <> struct hash<dfp::decimal128>;
}
}
In addition to the types indicated in [tr.unord.hash], the class template hash is required to be instantiable on the decimal floating-point types.
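A decimal floating-point value can have more than one representation (for example, 100 x 10^-2 and 1 x 10^0 denote the same number), so a hash specialization used with an unordered container needs to give equal values equal hash codes. The sketch below shows one way to do that by normalizing before hashing. It is illustrative only: toy_dec and normalized are hypothetical names, the coefficient/exponent struct is a stand-in for a real decimal encoding, and std::hash from C++11 is used in place of std::tr1::hash.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in: value = coefficient * 10^exponent.
struct toy_dec {
    long long coefficient;
    int exponent;
};

// Picks one canonical representation per value by stripping trailing
// zeros from the coefficient; zero is mapped to exponent 0.
inline toy_dec normalized(toy_dec d) {
    if (d.coefficient == 0) { d.exponent = 0; return d; }
    while (d.coefficient % 10 == 0) { d.coefficient /= 10; ++d.exponent; }
    return d;
}

// Equality compares values, not representations.
inline bool operator==(const toy_dec& a, const toy_dec& b) {
    const toy_dec x = normalized(a);
    const toy_dec y = normalized(b);
    return x.coefficient == y.coefficient && x.exponent == y.exponent;
}

namespace std {
template <>
struct hash<toy_dec> {
    std::size_t operator()(const toy_dec& d) const {
        const toy_dec n = normalized(d);  // hash the canonical form only
        const std::size_t h1 = std::hash<long long>()(n.coefficient);
        const std::size_t h2 = std::hash<int>()(n.exponent);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};
}  // namespace std

// Usage: a set keyed on toy_dec finds a value under any representation.
inline bool found_under_other_representation() {
    std::unordered_set<toy_dec> s;
    s.insert(toy_dec{100, -2});
    return s.count(toy_dec{1, 0}) == 1;
}
```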
<string> synopsis
namespace std {
namespace tr1 {
  // 3.14.2 Numeric conversions:
  decimal32 stod32 (string & str);
  decimal64 stod64 (string & str);
  decimal128 stod128 (string & str);
  string to_string(const decimal128 & val);
}
}
decimal32 stod32 (string & str); decimal64 stod64 (string & str); decimal128 stod128 (string & str);
Effects: the functions call strtod32(str.c_str()), strtod64(str.c_str()), and strtod128(str.c_str()), respectively. Each function returns the converted result, if any, and erases the characters from the front of str that were converted to get the result.
Returns: the converted result.
Throws: invalid_argument if strtod32, strtod64, or strtod128 reports that no conversion could be performed. Throws out_of_range if strtod32, strtod64, or strtod128 sets errno to ERANGE.
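The behavior described above (convert a prefix, erase it from the string, and report failures as exceptions) can be sketched with the binary analogue std::strtod, since the decimal strtod functions are not part of standard C++. The name toy_stod is hypothetical, and the use of an end pointer to find how many characters were consumed is an assumption about how an implementation would do it.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical binary-floating-point analogue of stod32/stod64/stod128.
double toy_stod(std::string& str) {
    const char* begin = str.c_str();
    char* end = 0;
    errno = 0;  // strtod only sets errno on failure, so clear it first
    const double result = std::strtod(begin, &end);
    if (end == begin)
        throw std::invalid_argument("toy_stod: no conversion could be performed");
    if (errno == ERANGE)
        throw std::out_of_range("toy_stod: result out of range");
    // Erase the characters that were consumed by the conversion.
    str.erase(0, static_cast<std::string::size_type>(end - begin));
    return result;
}

// Usage helper: returns the text left over after a successful conversion.
std::string remainder_after_parse(std::string text) {
    toy_stod(text);
    return text;
}
```

In this sketch the string is left unmodified when an exception is thrown; the text above does not say whether that is required.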
string to_string(const decimal128 & val);
Returns: a string object holding the character representation of the value of its argument that would be generated by calling sprintf(buf, fmt, val) with a format specifier of "%LD".
Throws: nothing.
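The sprintf-based definition above can be sketched for a type that standard C++ does support. The example below uses long double and the "%Lg" conversion purely as a stand-in, since the "%LD" specifier for decimal128 is part of this proposal and not of the standard library. The name toy_to_string is hypothetical, and the two-pass snprintf call is an assumption about buffer sizing, not something the text specifies.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical analogue of to_string(const decimal128&) for long double.
std::string toy_to_string(long double val) {
    // First pass: ask snprintf how many characters are needed.
    const int n = std::snprintf(0, 0, "%Lg", val);
    if (n < 0) return std::string();  // encoding error reported by snprintf
    // Second pass: format into a buffer with room for the terminator.
    std::vector<char> buf(static_cast<std::vector<char>::size_type>(n) + 1);
    std::snprintf(&buf[0], buf.size(), "%Lg", val);
    return std::string(&buf[0], static_cast<std::string::size_type>(n));
}
```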
One of the goals of the design of the decimal floating-point types that are the subject of this Technical Report is to minimize incompatibility with the C decimal floating types; however, differences between the C and C++ languages make some incompatibility inevitable. Differences between the C and C++ decimal types -- and techniques for overcoming them -- are described in this section.
<decfloat.h>
To aid portability to C++, it is recommended that C programmers #include the header file <decfloat.h> in those translation units that make use of the decimal floating types. This ensures that the equivalent C++ floating-point types will be available, should the program source be ported to C++.
Literals of decimal floating-point type are not introduced to the C++ language by this Technical Report, though implementations may support them as a conforming extension. C programs that use decimal floating-point literals will not be portable to a C++ implementation that does not support this extension.
In C, objects of decimal floating-point type can be converted to generic floating-point type by means of an explicit cast. In C++ this is not possible. Instead, the functions decimal_to_long_double, decimal32_to_long_double, decimal64_to_long_double, and decimal128_to_long_double should be used for this purpose. C programmers who wish to maintain portability to C++ should use the decimal32_to_long_double, decimal64_to_long_double, and decimal128_to_long_double forms instead of the cast notation.
[Editor's note: there's another issue that I believe requires further discussion. Currently, decimal values that are within the range of an unsigned long long but not within the range of a long long cannot be accurately converted to integral type in C++, though accurate conversion is possible in C.]