From J.Reid@letterbox.rl.ac.uk  Mon Feb  5 10:47:33 1996
Received: from letterbox.rl.ac.uk (letterbox.rl.ac.uk [130.246.8.100]) by dkuug.dk (8.6.12/8.6.12) with SMTP id KAA17378 for <SC22WG5@dkuug.dk>; Mon, 5 Feb 1996 10:47:13 +0100
Received: from jkr.cc.rl.ac.uk by letterbox.rl.ac.uk with SMTP (PP) 
          id <sg.27327-0@letterbox.rl.ac.uk>; Mon, 5 Feb 1996 09:45:45 +0000
Received: by jkr.cc.rl.ac.uk (4.1/SMI-4.1) id AA22687;
          Fri, 2 Feb 96 18:29:06 GMT
Date: Fri, 2 Feb 96 18:29:06 GMT
From: jkr@letterbox.rl.ac.uk (John Reid)
Message-Id: <9602021829.AA22687@jkr.cc.rl.ac.uk>
To: SC22WG5@dkuug.dk
Subject: Exceptions technical report


I have made some minor changes to the draft technical report following
email with Christian Weber and Keith Bierman. The changes are shown 
with change bars. The next stage is to construct the actual edits,
but before doing this, I would like to know if the technical 
specification meets with people's approval. In particular, I am hoping
that X3J3 will look at this next week.

Best wishes,

John Reid,


DRAFT TECHNICAL REPORT FOR FLOATING-POINT EXCEPTION HANDLING

2 February 1996

0. NOTES

This is a draft Technical Report using procedures in two modules for
exception handling. It is based on work done by X3J3 in San Diego in
November 1995.

Please consider the following changes:

1. Add more procedures. I am asking Larry Rolison to make a case.

2. Change IEEE_MODE to IEEE_LEVEL. This occured to me on return as a
better name.

3. Make IEEE_TEST_FLAG into a subroutine. The case for a function is
that it is more convenient in use (see example, below) and is
consistent with existing practice. The case for a subroutine is that it
would give a clear separation between statements that can change the
flag values and those that can access them. It would also avoid
breaking the rule that a function value should depend solely on
argument values, which applies to all the intrinsic functions. The
opinions I have so far are:
     Function: Bierman, Hendrickson, Whitlock.
     Subroutine: Reid, Wagener. 


1. RATIONALE

Exception handling is required for the development of robust and
efficient numerical software.  In particular, it is necessary in order
to be able to write portable scientific libraries.  In numerical
Fortran programming, current practice is to employ whatever exception
handling mechanisms are provided by the system/vendor.  This clearly
inhibits the production of fully portable numerical libraries and
programs. It is particularly frustrating now that IEEE arithmetic is so
widely used, since built into it are the five conditions: overflow,
invalid, divide_by_zero, underflow, and inexact. Our aim to to provide
support for these conditions.

This proposal involves two standard modules, IEEE_ARITHMETIC and
IEEE_SUBSET, each containing four derived types, some named constants
of these types, an integer named constant IEEE_MODE, and a collection
of simple procedures. They allow the flags to be tested, cleared, set,
saved, or restored. To facilitate maximum performance, each of the
proposed functions does very little processing of arguments. In many
cases, a processor may generate only a few inline machine code
instructions rather than library calls. Any feature supported on a
processor for IEEE_SUBSET must also be supported for IEEE_ARITHMETIC.
Some processors may execute faster using IEEE_SUBSET.

In order to allow for the maximum number of processors to provide the
maximum value to users, we do NOT require complete IEEE conformance.
What a vendor must do is to provide the module, and support the
overflow and divide-by-zero flags and the basic inquiry functions. A
user must utilize the inquiry functions to determine if he or she can
count on specific features of the IEEE standard, before using other
inquiry or value setting procedures.

Note that a processor ought not implement these as "macros", as IEEE
conformance is often controlled by compiler switches. A processor which
offers a switch to turn off a facility should adjust the values
returned for these inquiries. For example, a processor which allows
gradual underflow to be turned off (replaced with flush to zero) should
return .FALSE. for IEEE_SUPPORT_DENORMAL(X) when a source file is
processed with that option on. Naturally it should return .TRUE. when
that option is not in effect.

The most important use of a floating-point exception handling facility
is to make possible the development of much more efficient software
than is otherwise possible. The following "hypotenuse" function,
sqrt(x**2 + y**2), illustrates the use of the facility in developing
efficient software.

REAL FUNCTION HYPOT(X, Y)
! In rare circumstances this may lead to the setting of the OVERFLOW flag
   USE, INTRINSIC :: IEEE_ARITHMETIC 
   REAL X, Y
   REAL SCALED_X, SCALED_Y, SCALED_RESULT
   LOGICAL OLD_FLAGS(2)
   TYPE (IEEE_FLAG_TYPE), PARAMETER :: &
          OUT_OF_RANGE(2) = (/ IEEE_OVERFLOW, IEEE_UNDERFLOW /)
   INTRINSIC SQRT, ABS, EXPONENT, MAX, DIGITS, SCALE
! Store the old flags and clear them
   OLD_FLAGS = IEEE_TEST_FLAG(OUT_OF_RANGE)
   CALL IEEE_CLEAR_FLAG(OUT_OF_RANGE)  
! Try a fast algorithm first
   HYPOT = SQRT( X**2 + Y**2 )
   IF ( ANY (IEEE_GET_FLAG(OUT_OF_RANGE)) ) THEN
      CALL IEEE_CLEAR_FLAGS(OUT_OF_RANGE)  
      IF ( X==0.0 .OR. Y==0.0 ) THEN
         HYPOT = ABS(X) + ABS(Y)
      ELSE IF ( 2*ABS(EXPONENT(X)-EXPONENT(Y)) > DIGITS(X)+1 ) THEN
         HYPOT = MAX( ABS(X), ABS(Y) )!  one of X and Y can be ignored
      ELSE     ! scale so that ABS(X) is near 1
         SCALED_X = SCALE( X, -EXPONENT(X) )
         SCALED_Y = SCALE( Y, -EXPONENT(X) )
         SCALED_RESULT = SQRT( SCALED_X**2 + SCALED_Y**2 )
         HYPOT = SCALE( SCALED_RESULT, EXPONENT(X) ) ! may cause overflow
      END IF        
   END IF        
   IF(OLD_FLAGS(1)) CALL IEEE_SET_FLAGS(IEEE_OVERFLOW)  
   IF(OLD_FLAGS(2)) CALL IEEE_SET_FLAGS(IEEE_UNDERFLOW)  
END FUNCTION HYPOT

An attempt is made to evaluate this function directly in the fastest
possible way. (Note that with hardware support, exception checking is
very efficient; without language facilities, reliable code would
require programming checks that slow the computation significantly.)
The fast algorithm will work almost every time, but if an exception
occurs during this fast computation, a safe but slower way evaluates
the function.  This slower evaluation may involve scaling and
unscaling, and in (very rare) extreme cases this unscaling can cause
overflow (after all, the true result might overflow if X and Y are both
near the overflow limit).  If the overflow or underflow flag is set on
entry, it is reset on return, so that earlier exceptions are not
lost.

Can all this be accomplished without the help of an exception handling
facility? Yes, it can - in fact, the alternative code can do the job,
but of course it is much less efficient. That's the point. The HYPOT
function is special, in this respect, in that the normal and
alternative codes try to accomplish the same task. This is not always
the case. In fact, it very often happens that the alternative code
concentrates on handling the exceptional cases and is not able to
handle all of the non-exceptional cases. When this happens, a program
which cannot take advantage of hardware flags could have a structure
like the following:
    if ( in the first exceptional region ) then
        handle this case
    else if ( in the second exceptional region ) then
        handle this case
      :
    else
        execute the normal code
    end
But this is not only inefficient, it also "inverts" the logic of the
computation.  For other examples, see Hull, Fairgrieve and Tang (1994)
and Demmel and Li (1994).

The code for the HYPOT function can be generalized in an obvious
way to compute the Euclidean Norm, sqrt( x(1)**2 + x(2)**2 + ... +
x(n)**2 ), of an n-vector; the generalization of the alternative code is
not so obvious (though straightforward) and will be much slower
relative to the normal code than is the case with the HYPOT function.

Of the five IEEE conditions, underflow and inexact are obviously milder
than the other three (inexact is particularly mild and signals almost
all the time in typical codes). We therefore group the other three
together as USUAL.

There is a need for further intrinsic conditions in connection with
reliable computation. Examples are 
   a. INSUFFICIENT_STORAGE for when the processor is unable to find
      sufficient storage to continue execution.
   b. INTEGER_OVERFLOW and INTEGER_DIVIDE_BY_ZERO for when an 
      intrinsic integer operation has a very large result or 
      has a zero denominator. 
   c. INTRINSIC for when an intrinsic procedure has been unsuccessful.
   d. SYSTEM_ERROR for when a system error occurs. 
This proposal has been designed to allow such enhancements in the future.  

References

Demmel, J.W. and Li, X. (1994). Faster Numerical Algorithms via 
   Exception Handling. IEEE Transactions on Computers, 43,
   no. 8, 983-992.
Hull, T.E., Fairgrieve, T.F., and Tang, T.P.T. (1994).
   Implementing complex elementary functions using exception handling.
   ACM Trans. Math. Software 20, 215-244. 


2. TECHNICAL SPECIFICATION

2.1 The model

This proposal is based on the IEEE model with flags for the
floating-point exceptions (invalid, overflow, divide-by-zero,
underflow, inexact), a flag for the rounding mode (nearest, up, down,
to zero), and flags for whether halting occurs following exceptions. It
is not necessary for the hardware to have any such flags (they may be
simulated by software) or for it to support all the modes. Inquiry
procedures are available for determining the extent of support. The
minimum requirement is for the support of overflow and divide-by-zero.

As well as the IEEE exceptions, the model contains the exceptions
'usual' and 'all'. The flag usual is regarded as set if any of invalid,
overflow, divide-by-zero is set. Explicitly setting or clearing usual
(by invoking a procedure) has the corresponding effect on all three
flags. The flag all is similarly related to all five flags.

Some hardware may execute faster with compiled code that provides only
partial support of these features than with code than provides fuller
support. This proposal therefore involves two intrinsic modules, which
gives the programmer control over the extent of support. They are named
IEEE_ARITHMETIC and IEEE_SUBSET. Any feature supported on a processor
for IEEE_SUBSET must also be supported for IEEE_ARITHMETIC.

The modules contain four derived types (section 2.3), an integer
constant to control the level of support (section 2.4), and a
collection of procedures (sections 2.5 to 2.10). None of the procedures
is permitted as an actual argument.


2.2  The use statement for an intrinsic module

New syntax on the USE statement provides control over whether it is
intended to access an intrinsic module:
      USE, INTRINSIC :: IEEE_ARITHMETIC
or not:
      USE, NON_INTRINSIC :: MY_IEEE_ARITHMETIC
The INTRINSIC statement is not extended. For the Fortran 90 form
      USE IEEE_ARITHMETIC
the processor looks first for a non-intrinsic module.


2.3 The derived types

The modules contain the following derived types

  1. IEEE_FLAG_TYPE, for identifying a particular exception flag. Its
     only possible values are those of constants defined in the module:
     IEEE_INVALID, IEEE_OVERFLOW, IEEE_DIVIDE_BY_ZERO, IEEE_USUAL,
     IEEE_UNDERFLOW, IEEE_INEXACT, and IEEE_ALL.

  2. IEEE_CLASS_TYPE, for identifying a class of floating-point values.
     Its only possible values are those of constants defined in the
     module: IEEE_SIGNALING_NAN, IEEE_QUIET_NAN, IEEE_NEGATIVE_INF,
     IEEE_NEGATIVE_NORMAL, IEEE_NEGATIVE_DENORMAL, IEEE_NEGATIVE_ZERO,
     IEEE_POSITIVE_ZERO, IEEE_POSITIVE_DENORMAL, IEEE_POSITIVE_NORMAL,
     IEEE_POSITIVE_INF.

  3. IEEE_ROUND_TYPE, for identifying a particular rounding mode. Its
     only possible values are those of constants defined in the module:
     IEEE_NEAREST, IEEE_TO_ZERO, IEEE_UP,  IEEE_DOWN, and IEEE_OTHER.

  4. IEEE_STATUS_TYPE, for saving the current floating point status.


2.4 The level of support

Both modules contain the constant IEEE_MODE of type default integer.
It has the value 1 in IEEE_SUBSET and the value 2 in IEEE_ARITHMETIC.

The extent of IEEE support must be no less for IEEE_ARITHMETIC than for
IEEE_SUBSET. If a scoping unit has access to IEEE_MODE of
IEEE_ARITHMETIC, the scoping unit must provide the corresponding level
of support. If a scoping unit has access to IEEE_MODE of IEEE_SUBSET
and does not have access to IEEE_MODE of IEEE_ARITHMETIC, the scoping
unit must provide the subset level of support. Execution may be faster
with the subset. If a scoping unit has access to neither IEEE_MODE,
the level of support is processor determined, and need not include
support for any exceptions. Following execution of code in such a        |
scoping unit, the values of the flags are processor determined.          |


2.5 The exception flags

The flags are initially clear and are set when an exception occurs. The
value of a flag is determined by the elemental function

    IEEE_TEST_FLAG(FLAG)

where FLAG is of type IEEE_FLAG_TYPE. Being elemental allows an array
of flag values to be obtained at once and obviates the need for a list
of flags.

Flags may be cleared by the elemental subroutine

    IEEE_CLEAR_FLAG(FLAG)

and may be set by the elemental subroutine

    IEEE_SET_FLAG(FLAG)


2.6 Inquiry functions for the features supported

The minimum requirement when IEEE_MODE of either mode are accessible
is for the support of IEEE_OVERFLOW and IEEE_DIVIDE_BY_ZERO. Whether
the other exceptions are supported may be determined by the elemental
inquiry function:

IEEE_SUPPORT_FLAG(FLAG [, X] )  True if the processor supports an
  exception flag for all reals (X absent) or for reals of the same kind
  type parameter as the argument X.

The extent of further support for the IEEE standard may be determine by
the inquiry functions:

IEEE_SUPPORT_DATATYPE ([X])  True if the processor supports IEEE arithmetic |
  for all reals (X absent) or for reals of the same kind type parameter as
  the argument X.  Here support means employing an IEEE data format and
  performing the operations of +, -, and * as in the IEEE standard
  whenever the operands and result all have normal values.

IEEE_SUPPORT_STANDARD([X])  True if the processor supports all the IEEE     |
  facilities defined in this standard for all reals (X absent) or for
  reals of the same kind type parameter as the argument X.

IEEE_SUPPORT_DENORMAL([X])  True if the processor supports the IEEE
  gradual underflow facility for all reals (X absent) or for reals of
  the same kind type parameter as the argument X.
  
IEEE_SUPPORT_DIVIDE([X])  True if the processor supports IEEE divide
  for all reals (X absent) or for reals of the same kind type parameter
  as the argument X.

IEEE_SUPPORT_HALTING(FLAG [, X] )  True if the processor supports the
  ability to control during program execution whether to abort or
  continue execution after an exception.

IEEE_SUPPORT_INF([X])  True if the processor supports the IEEE infinity
  facility for all reals (X absent) or for reals of the same kind type
  parameter as the argument X.

IEEE_SUPPORT_NAN([X])  True if the processor supports the IEEE
  Not-A-Number facility for all reals (X absent) or for reals of the
  same kind type parameter as the argument X.

IEEE_SUPPORT_ROUNDING ( ROUND_VALUE [, X] )  True if the processor
  supports changing to a particular rounding mode for all reals (X
  absent) or for reals of the same kind type parameter as the argument
  X.

IEEE_SUPPORT_SQRT([X])  True if the processor supports IEEE square root
  for all reals (X absent) or for reals of the same kind type parameter
  as the argument X.


2.7. Elemental functions

The modules contain the following elemental functions:

IEEE_CLASS(X)  Returns a value of type IEEE_CLASS_TYPE for X (see
  Section 2.3 for the possible values).

IEEE_COPY_SIGN(X,Y)  IEEE copysign function, that is X with the sign of Y.

IEEE_IS_FINITE(X)  IEEE finite function. True if CLASS(X) has one of
  the values IEEE_NEGATIVE_NORMAL, IEEE_NEGATIVE_DENORMAL,
  IEEE_NEGATIVE_ZERO, IEEE_POSITIVE_ZERO, IEEE_POSITIVE_DENORMAL,
  IEEE_POSITIVE_NORMAL.

IEEE_IS_NAN(X)  True if the value is IEEE Not-a-Number.

IEEE_IS_NEGATIVE(X)  True if the value is negative.

IEEE_IS_NORMAL(X)  True if the value is a normal number.

IEEE_LOGB(X) IEEE logb function, that is, the unbiased exponent of X.

IEEE_NEXTAFTER(X,Y)  Returns the next representable neighbor of X in
  the direction toward Y.

IEEE_SCALB (X,I)  Returns X * 2**I.

IEEE_SQRT(X)  Square root as defined in the IEEE standard.

IEEE_UNORDERED(X,Y)  IEEE unordered function. True if X or Y is a NaN
  and false otherwise.

IEEE_VALUE(X, CLASS)  Generate an IEEE value of the kind of X. The
  value of CLASS is permitted to be
   (i) IEEE_SIGNALING_NAN or IEEE_QUIET_NAN if IEEE_SUPPORT_NAN(X) has
       the value true,
   (ii) IEEE_NEGATIVE_INF or IEEE_POSITIVE_INF if IEEE_SUPPORT_INF(X)
       has the value true, 
   (iii) IEEE_NEGATIVE_ZERO if IEEE_SUPPORT_DATATYPE(X) has the value    |
       true, or
   (iv)  IEEE_POSITIVE_ZERO. 
  In the case of a NaN, the value is processor dependent but fixed for
  each kind.


2.8. Elemental subroutines

The modules contain the following elemental subroutines:

IEEE_CLEAR_FLAG (FLAG) Clear an exception flag.

IEEE_SET_FLAG(FLAG) Set an exception flag.


2.9. Non-elemental subroutines

The modules contain the following non-elemental subroutines:

IEEE_GET_STATUS(STATUS_VALUE)  Get the current values of the set of
  flags that define the current state of the floating point
  environment.  STATUS_VALUE is of type IEEE_STATUS_TYPE.

IEEE_GET_ROUNDING_MODE(ROUND_VALUE)  Get the current IEEE rounding
  mode.  ROUND_VALUE is of type IEEE_ROUND_TYPE.

IEEE_GET_HALTING_MODE(FLAG [, X] )  Get halting mode for an exception.
  FLAG is of type IEEE_FLAG_TYPE. The initial halting mode is processor   |
  dependent. Halting is not necessary immediate, but normal processing    |
  does not continue.                                                      |

IEEE_HALT( FLAG, HALTING [, X] )  Controls continuation or halting on
  exceptions.  FLAG is of type IEEE_FLAG_TYPE. HALTING is of type
  default logical. 

IEEE_SET_STATUS(STATUS_VALUE)  Restore the values of the set of flags
  that define the current state of the floating point environment
  (usually the floating point status register). STATUS_VALUE is of type
  IEEE_STATUS_TYPE and has been set by a call of IEEE_GET_STATUS.

IEEE_SET_ROUNDING_MODE(ROUND_VALUE)  Set the current IEEE rounding
  mode.  ROUND_VALUE is of type IEEE_ROUND_TYPE, with value
  IEEE_NEAREST, IEEE_TO_ZERO, IEEE_UP, or IEEE_DOWN.


2.10. Transformational function

The modules contain the following transformational function:

IEEE_SELECTED_REAL_KIND ( [P,] [R]) As for SELECTED_REAL_KIND but gives
  an IEEE kind.