From owner-sc22wg5@dkuug.dk  Thu Oct  9 20:50:37 2003
Message-Id: <200310091850.h99Iof46005014@math.jpl.nasa.gov>
X-Mailer: exmh version 2.5 01/15/2001 with nmh-1.0.4
Reply-to: Van.Snyder@jpl.nasa.gov
From: Van.Snyder@jpl.nasa.gov
To: sc22wg5@dkuug.dk
Subject: Wishes for future revisions of Fortran
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 09 Oct 2003 11:50:41 -0700
X-Spam-Score: 0.339 () NO_REAL_NAME
Sender: owner-sc22wg5@dkuug.dk
Precedence: bulk


I've just spent 40 hours tracking down a bug that turned out to have
resulted from a typo introduced during manual inlining of a procedure
with four executable statements (other than the END statement).  The
procedure is referenced in dozens of places in a code of 140,000 or so
lines.  Inlining it makes a difference of a factor of eight in the run
time, which was 14 hours on each of 386 3 GHz Pentium Xeon processors
before the inlining.  The bug arose in a case where one of the arguments
is a product of three arrays, each of which has a vector subscript, each
subscript of which has a vector subscript, each of which is a section. 
Something like A(A1(A2(I:J)))*B(B1(B2(I:J)))*C(C1(C2(I:J))).  Naturally,
this reference is the one that makes the most difference in the run time.
The reference is within an inner loop.  I DID try creating an array temp
outside the loop to reduce the stress on the memory allocator.  This
provided an improvement, but not nearly as dramatic an improvement as
inlining did.
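
To make the shape of that reference concrete, here is a small sketch.
The names (total, a, a1, a2, and so on) and the data are made up for
illustration; they are not from the actual code, and the real procedure
does more than sum its argument.  The first form passes the array
expression to a procedure, which a processor may well evaluate into a
temporary on every reference; the second is roughly what manual inlining
looks like, with the subscripts applied element by element.

```fortran
! Illustrative sketch only: total, a, a1, a2, ... are hypothetical names,
! not the procedure or arrays from the real code.
program vector_subscript_sketch
  implicit none
  integer, parameter :: n = 6
  real    :: a(n), b(n), c(n)
  integer :: a1(n), a2(n), b1(n), b2(n), c1(n), c2(n)
  integer :: i, j, k
  real    :: by_call, by_hand

  a  = [(real(k), k = 1, n)]
  b  = 2.0
  c  = 0.5
  a1 = [(n + 1 - k, k = 1, n)]   ! reversal permutation
  a2 = [(k, k = 1, n)]           ! identity permutation
  b1 = a2;  b2 = a1
  c1 = a1;  c2 = a1
  i = 2;  j = 5

  ! Procedure reference: the actual argument is an array expression
  ! built from nested vector subscripts applied to a section.
  by_call = total(a(a1(a2(i:j))) * b(b1(b2(i:j))) * c(c1(c2(i:j))))

  ! Manual inlining: the same computation written out at the point of
  ! reference, one element at a time, with no array-valued argument.
  by_hand = 0.0
  do k = i, j
    by_hand = by_hand + a(a1(a2(k))) * b(b1(b2(k))) * c(c1(c2(k)))
  end do

  ! The two forms are meant to agree.
  if (abs(by_call - by_hand) > 1.0e-6) error stop "results differ"
  print *, by_call, by_hand

contains

  ! Stand-in for the small procedure; it just sums its argument.
  pure function total(x) result(s)
    real, intent(in) :: x(:)
    real    :: s
    integer :: m
    s = 0.0
    do m = 1, size(x)
      s = s + x(m)
    end do
  end function total

end program vector_subscript_sketch
```

The check at the end is there only to show that the two forms are
supposed to compute the same thing.  A slip in one of the nested
subscripts of the hand-written loop is exactly the kind of typo that is
easy to introduce and hard to find.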

Can we PLEASE PLEASE PLEASE have an INLINE attribute for subprograms in a
future revision of the Fortran standard -- preferably the next one after
2003?

Words something like

 The INLINE attribute of a subprogram indicates that a substantial
 performance improvement would result if the subprogram were to be
 materialized in place of references to it.  The INLINE attribute does
 not change the interpretation of the subprogram or a reference to it.  A
 subprogram with the INLINE attribute shall not have the RECURSIVE
 attribute.  An internal subprogram has the INLINE attribute if the
 subprogram within which it is defined has the INLINE attribute.

would be acceptable.  The last restrictions could, of course, be
constraints.  Inlining need not be a requirement put onto the processor. 
Advice is good enough.  I wouldn't object to additional restrictions.
E.g., it may be necessary to prohibit an INLINE subprogram from having a
local variable with the SAVE attribute, or from having an internal
subprogram, instead of specifying that the INLINE attribute applies to
any internal subprograms as well.

The "same interpretation" requirement means that the inlined procedure
does NOT access the environment where it is inlined except by way of the
actual arguments.
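
As a sketch of how this might look in source text (hypothetical syntax,
not standard Fortran, and not something any processor is known to
accept; spelling INLINE as a prefix like PURE or ELEMENTAL is only an
assumption, as are the names accumulate, x, total, w, s, n and k):

```fortran
! Hypothetical: INLINE is not a standard prefix.  This fragment is a
! sketch of the proposal and will not compile as written.
inline subroutine accumulate(x, total)
  real, intent(in)    :: x(:)
  real, intent(inout) :: total
  integer :: k                 ! local to accumulate
  do k = 1, size(x)
    total = total + x(k)
  end do
end subroutine accumulate

! A reference from a caller that has its own variable named k:
do k = 1, n
  call accumulate(w(:, k), s)
end do
! "Same interpretation" means that if the processor materializes
! accumulate here, the caller's k is untouched: the inlined body sees
! the caller only through the actual arguments w(:, k) and s.
```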

I realize there may be technical difficulties in inlining a module
subprogram into a different program unit if the inlined subprogram
accesses private module variables by host association.  It's probably
also impossible to inline type-bound references to it -- at least ones
from polymorphic variables.  But being able to put the advice into a
program's text means that when a processor eventually becomes able to do
the inlining, compiling the program gives an immediate boost in
performance.

No, compiler heuristics and command line arguments are not adequate,
unless the standard standardizes them -- which I think would be more work
than standardizing an INLINE attribute, and a big mistake as well. 
Depending on command-line arguments is a portability headache for the guy
who develops or maintains the make files.

--
Van Snyder                    |  What fraction of Americans believe 
Van.Snyder@jpl.nasa.gov       |  Wrestling is real and NASA is fake?
Any alleged opinions are my own and have not been approved or disapproved
by JPL, CalTech, NASA, Sean O'Keefe, George Bush, the Pope, or anybody else.


