Document Number: P0828R1
Date: 2018-06-25
Reply-to: John McFarlane
Audience: SG6, SG12, SG14
This paper proposes a numeric type, elastic_integer
, which returns widened results from common arithmetic operations in order to prevent out-of-range errors. It is highly composable and can be combined with other types such as overflow-trapping and fixed-point numeric types to provide arithmetic with all of the efficiency of integers and some of the benefit traditionally enjoyed by dynamic types such as bignum and floating-point.
Fundamental integer types are usually successful in representing quantities. When they are unsuccessful, it is often due to operations producing results with out-of-range values.
auto positive_unsigned_overflow = UINT_MAX + 1; // 0U
auto positive_signed_overflow = INT_MAX + 1; // undefined behavior
auto negative_unsigned_overflow = 0U > -1; // false
It is difficult and costly to detect and correct such errors. As always, prevention is better than cure:
auto positive_unsigned_ok = static_cast<unsigned long long>(UINT_MAX) + 1;
auto positive_signed_ok = static_cast<long long>(INT_MAX) + 1;
auto negative_unsigned_ok = static_cast<long long>(0U) > -1;
This solution places the responsibility for managing range firmly with the user, a task which is made especially onerous for two reasons.
Firstly, the solution is not generic: in a setting where the integer type is a template parameter, the user cannot easily call upon the next widest integer by type.
Secondly, difference in the width of integers on different architectures means calling upon the next widest integer type is not even foolproof in non-generic code. Indeed, the above casts may not increase range on platforms where sizeof(long long) == sizeof(int)
. (Fixed-width integer types do not completely solve this problem as they are aliases to types which may be wider than advertised.)
Portable, generic code for avoiding integer overflow requires facilities to manipulate two properties of integers directly affecting range: signedness and number of digits. is_signed
, make_signed
and make_unsigned
supported the manipulation of signedness for fundamental integers types only. The manipulation of the number of digits is even less well supported.
P0102 and P0675 both propose facilities for requesting an integer by its width or its number of digits, respectively. P0102 is limited to fundamental arithmetic types whereas P0675 caters for user-defined types.
P0675's num_digits_v
and set_num_digits_t
components make it possible to widen correctly for arithmetic operations:
template<typename Operand>
auto multiply(Operand a, Operand b) {
// Get the number of digits in the input type.
constexpr auto operand_digits = num_digits_v<Operand>;
// Results of multiplication contain twice the number of digits.
constexpr auto result_digits = operand_digits * 2;
// Use this to determine the result type.
using result_type = set_num_digits_t<Operand, result_digits>;
// Promoted operands before operation.
return static_cast<result_type>(a) * static_cast<result_type>(b);
}
Two limitations with this approach are:
Both of the above problems can be solved with a numeric type, elastic_integer
(reference implementation, live demo) that parametrizes signedness and number of digits:
template<int Digits, class Narrowest = int>
class elastic_integer;
Where:
Digits
specifies the number of binary digits used to express magnitude (excluding sign bit). It corresponds directly to the num_digits_v
variable template and set_num_digits_t
class template from P0675. Digits
must be a positive value.Narrowest
specifies the narrowest type used to store the value. If Digits
exceeds the capacity of Narrowest
, a wider type is substituted to ensure that the value can be represented.For two's complement signed types, the most negative number is not in the range allowed by the type. This avoids certain edge cases, e.g. where -INT_MIN
exceeds INT_MAX
, thus requiring an entire extra digit.
elastic_integer
automates the work of tracking range. Thus, the user is less likely to experience out-of-range errors:
auto d(elastic_integer<15> x, elastic_integer<15> y) // [-32767..32767]
{
// When two values are multiplied, result is the sum of the `Digits` parameters.
auto xx = x*x, yy = y*y; // elastic_integer<30>
// When two values are summed, result is the `Digits` parameter plus one.
auto dd = xx + yy; // elastic_integer<31>
// A square root operation almost halves the number of digits required
auto d = square_root(dd); // elastic_integer<31> in range [0..46339]
// so we know it's safe to cast to a narrower type.
return elastic_integer<16, unsigned>(d);
}
elastic_integer
can be initialized from any integer type which employs the customization points laid out in P0675. For example:
auto a = elastic_integer(123); // commonly deduced as elastic_integer<31>
auto b = elastic_integer(UINT64_C(4096)); // commonly deduced as elastic_integer<64, unsigned>
With general-purpose constant value type such as those proposed in P0377 and P0827 an optimized Digits
parameter can be determined:
auto a = elastic_integer(constant<123>()); // deduced as elastic_integer<7>
auto b = elastic_integer(constant<4096>()); // deduced as elastic_integer<13>
The addition of a user-defined literal for generating constant value types further improves terseness:
auto a = elastic_integer(123static); // deduced as elastic_integer<7>
auto b = elastic_integer(4096static); // deduced as elastic_integer<13>
This proposal adds header, <elastic_integer>
, containing class template, elastic_integer
and a collection of operator overload function templates which only participate in overload resolution when either of the operands is of type, elastic_integer
.
The usefulness of elastic_integer
is greatly enhanced with a number of additions. In particular, P0675 allows elastic_integer
to be composed with other types and P0827 adds a number of facilities to elastic_integer
which make it more efficient and usable.
The design of elastic_integer
arises from observations of the fixed_point
type from P0037 when compared with the fixed-point types proposed by Lawrence Crowl in P0106. P0106's fixed-point types -- not only approximate real numbers using integer arithmetic but also -- address a variety of concerns related to accuracy and safety such as rounding and out-of-range errors. In response to these observations, P0554 argues for an alternative approach in which individual concerns are addressed by individual numeric components and those components are used to compose a composite type which solves the full list problems identified in P0106.
One such problem addressed in P0106 is that of preventing out-of-range conditions. The solution there is the same elasticity property adopted by elastic_integer
. The difference is that elastic_integer
does nothing else but solve this single problem.
elastic_integer
offers little protection against out-of-range errors caused by narrowing or compound operations that increases the magnitude of the value:
elastic_integer<10> a = 1023; // within range; OK
auto b = elastic_integer<9>(a); // narrowing cast exceeds range; undefined behavior
++ a; // increment exceeds range; undefined behavior
a = -1024; // narrowing assignment exceeds range; undefined behavior
Such errors cannot be prevented using the type system alone. Either the user has to take care to avoid these errors or run-time checks must be performed. Fortunately, this can be remedied by combining elastic_integer
with a so-called 'safe' integer type, e.g.:
// a numeric component that traps out-of-range errors
template<class T>
overflow_integer;
overflow_integer<elastic_integer<10>> a = 1024; // trap!
auto aa = a*a; // no need for run-time check: num_digits_v<decltype(a*a)> is wide enough to hold the result
(This composite type is said to have a numeric component nesting depth of 2 because one numeric component is nested within another. elastic_integer<10, int>
has a nesting depth of 1 and int
has a nesting depth of 0.)
P0037 introduces a fixed-point type which uses integers to approximate real numbers:
template<class Rep, int Exponent>
class fixed_point;
Fixed-point arithmetic increases the opportunity for out-of-range errors. In the following example on a system where int
is 32 bits, a 31-digit integer is expected to store a 53-bit value, resulting in UB.
auto kibi = fixed_point<int32_t, -16>(1024); // 2^26
auto mebi = kibi * kibi; // fixed_point<int, -32>; value: 2^52
The problem here is with the backing type used to represent the fixed-point value -- not the fixed-point arithmetic per se. But the fact that fixed-point types often use more of their allotted capacity exacerbates the issue. So replacing int
with elastic_integer
produces a composite type that prevents the error:
template<int Digits, int Exponent>
using elastic_fixed_point = fixed_point<elastic_integer<Digits>, Exponent>;
auto kibi = elastic_fixed_point<31, -16>(1024); // stores value 2^26
auto mebi = kibi * kibi; // elastic_fixed_point<62, -32> stores value: 2^52
The benefits of the composite, elastic_fixed_point
is significant. Not only does it eliminate the risk of both overflow and underflow, but it does it with the minimum of integer arithmetic. A modern optimizing compiler reduces the above multiplication to a single integer operation:
auto kibi = 1024*65536;
auto mebi = int64_t{kibi}*kibi;
The default type for representing integer quantities is int
. It is the default for Narrowest
for the same reasons. This means that even a one-digit type (elastic_integer<1>
), is as wide as int
. When storage matters more than performance, int
is the wrong choice. For example:
struct date {
elastic_integer<15, int8_t> year; // [-32767..32767]
elastic_integer<4, uint8_t> month; // [0..15]
elastic_integer<5, uint8_t> day; // [0..31]
};
constexpr auto epochalypse = date{2038, 1, 19};
constexpr auto end_of_ice_age = date{-10000, 7, 19};
Typically, the sizes of the three member variables of date
are 2, 1 and 1 bytes respectively and most users can assume that sizeof(date)==4
.
The usability of numeric component types in arithmetic expressions is greatly affected by the set of types accepted by their operators. In the case of elastic_integer
, being able to perform operations between an elastic_integer
type and a non-elastic_integer
type is safe and convenient:
auto a = elastic_integer<2>(2) + 2; // equivalent to elastic_integer<2>(2) + elastic_integer(2)
A summary of the rules for binary operators taking a single elastic_integer
operand are as follows:
elastic_integer
operand is a static numeric component (e.g. int
or overflow_integer<int>
):
elastic_integer
operand is a dynamic numeric component (e.g. floating-point or bignum), the elastic_integer
operand is converted to the dynamic numeric component and operation is performed on the two dynamic numeric components.These rules should be applicable to all numeric components. However, the rules for when both operands are the same numeric component are specific to that component. In the case of elastic_integer
the rules vary depending on the operator.
It is uncertain whether allowing elastic_integer<0>
is wise or even possible.
For types instantiated from unsigned fundamental integers, e.g. elastic_integer<0, unsigned>
, only the value zero is possible. Whether such a type would require any storage at all is unclear. If not, it would be a special case because storage is normally rounded up to the width of the Narrowest
parameter. elastic_integer<0, void>
might be a way to express a zero-sized type. If a zero-sized integer becomes standardized, a zero-digit elastic type would suddenly make a lot more sense and zero storage would be expected.
For types instantiated from signed fundamental integers, there is the added question of the sign bit. It might be expected but would certainly be of little use. Even on two's complement systems where a zero-digit signed integer has the range [0..-1], the rule forbidding the most negative number would reduce the range to zero.
It is not at all clear that elasticity is an appropriate metaphor for this auto-widening behavior.
<elastic_integer>
synopsisnamespace std {
// class template elastic_integer
template<int Digits = num_digits_v<int>, class Narrowest = int> class elastic_integer;
// unary operators
template<int Digits, class Narrowest>
constexpr elastic_integer<Digits, Narrowest>
operator+(const elastic_integer<Digits, Narrowest>&);
template<int Digits, class Narrowest>
constexpr elastic_integer<Digits, std::make_signed_t<Narrowest>>
operator-(const elastic_integer<Digits, Narrowest>&);
// binary operators (see 4.1 for return types)
template<int Digits1, class Narrowest1, int Digits2, class Narrowest2>
constexpr auto
operator@(const elastic_integer<Digits1, Narrowest1>&, const elastic_integer<Digits2, Narrowest2>&);
template<int Digits, class Narrowest, class T>
constexpr auto
operator@(const elastic_integer<Digits, Narrowest>&, const T&);
template<class T, int Digits, class Narrowest>
constexpr auto
operator@(const T&, const elastic_integer<Digits, Narrowest>&);
// pre/post increment/decrement
template<int Digits, class Narrowest>
constexpr elastic_integer<Digits, Narrowest>&
operator++(elastic_integer<Digits, Narrowest>&);
template<int Digits, class Narrowest>
constexpr elastic_integer<Digits, Narrowest>
operator++(elastic_integer<Digits, Narrowest>&, int);
template<int Digits, class Narrowest>
constexpr elastic_integer<Digits, Narrowest>&
operator--(elastic_integer<Digits, Narrowest>&);
template<int Digits, class Narrowest>
constexpr elastic_integer<Digits, Narrowest>
operator--(elastic_integer<Digits, Narrowest>&, int);
// compound assignment operators
template<int Digits, class Narrowest, class T>
constexpr elastic_integer<Digits, Narrowest>&
operator@=(elastic_integer<Digits, Narrowest>&, const T&);
// numeric traits (see P0554 for details)
template<int Digits, class Narrowest>
struct digits<elastic_integer<Digits, Narrowest>>;
template<int Digits, class Narrowest, int MinNumBits>
struct set_digits<elastic_integer<Digits, Narrowest>, MinNumBits>;
template<int Digits, class Narrowest>
constexpr typename elastic_integer<Digits, Narrowest>::rep
to_rep(const elastic_integer<Digits, Narrowest>&);
template<int Digits, class Narrowest, class Rep>
struct from_rep<elastic_integer<Digits, Narrowest>, Rep>;
template<int Digits, class Narrowest, class Value>
struct from_value<elastic_integer<Digits, Narrowest>, Value>;
template<int ShiftDigits, int ScalarDigits, class ScalarNarrowest>
struct shift<ShiftDigits, 2, elastic_integer<ScalarDigits, ScalarNarrowest>>;
}
elastic_integer
namespace std {
template<int Digits, class Narrowest>
class elastic_integer {
public:
using rep = set_num_digits_t<Narrowest, max(Digits, num_digits_v<Narrowest>)>;
private:
rep rep_; // exposition only
public:
elastic_integer() = default;
template<class N> constexpr elastic_integer(const N&);
template<class N> elastic_integer& operator=(N s);
template<class N> explicit constexpr operator N() const;
};
}
This section explores feedback raised in review.
Q: Why might the signedness of the result of unary operator+
be signed, even when the input is unsigned?
A: The reason to consider returning a signed result is symetry. When using native integers, -1U
is 0xffffffffU
. In other words, both signed and unsigned fundamental types return the same signedness for unary plus and minus. In the case of unsigned integers, it gives us the much-lamented off-by-4.3-billion error. However, there is little reason to make the result signed in the case of unary operator+
.
Q: If converting from unsigned to signed, shouldn't the number of bits be increased by one in case the unsigned number coming in can't fit?
A: The incoming number is guaranteed to fit for two reasons: Firstly, Digits
does not include the sign bit so the type will be a bit wider but Digits
will remain unchanged. Secondly, signed elastic_integer
has a symetric range around zero, i.e. the most negative number is not a valid value. (See section 2.4 for more details.)
Q: Why don't post and pre-increment, post and pre-decrement and compound assignment operators return wider types?
A: elastic_integer
returns a wider type when it is opportune to do so but it cannot perform miracles. The pre/post and compount assignment operators mutate the value passed to them so returning a wider type is either partially or totally futile. (See section 4.1.1 for details.)
Q: Why are mutating operators such as pre-increment and compound assignment marked as constexpr
.
A: So they can be used in constant expressions. (Example)
Thanks to Arthur O'Dwyer for advice on customization points.