From J.L.Schonfelder@liverpool.ac.uk Fri Mar 18 11:24:15 1994
From: "Dr.J.L.Schonfelder" <J.L.Schonfelder@liverpool.ac.uk>
Message-Id: <9403181124.AA20985@uxh.liv.ac.uk>
Subject: constructor Enhancements
To: SC22WG5@dkuug.dk (SC22/WG5 members)
Date: Fri, 18 Mar 1994 11:24:15 +0000 (GMT)

This is a replacement for the previous paper on this topic, revised
as a result of comments. (David, again, this or the PostScript
version should replace the existing version.)
------------------------------------------------------------
 To: X3J3 and WG5 (18 March 1994)
 From: Lawrie Schonfelder 
 References: X3J3-94/107 and 108
  
                          Constructor Enhancements
 
 1 Introduction     
 At meeting 128, as a result of both full committee discussion and
 consideration in the OOF subgroup, it was suggested that the
 current constructor was both inconsistent and unfriendly. This
 paper is a proposal following the lines indicated in the above
 papers, which were given in-principle support by the full
 committee.
     The constructor as defined in F90 has the syntactic form of
 a function reference but, unlike any other intrinsically defined
 function, it cannot be referenced using keyword arguments, nor can
 the name be overloaded by the user. These two shortcomings are
 essentially separate, except that if the latter is permitted then
 the former comes as a byproduct. The proposal is therefore made
 in two parts. The first is to allow the use of component names as
 keywords in a constructor reference. The second is to redefine
 the constructor as a generic function so as to allow the name to
 be overloaded by the user.
 
 2 Technical Description 
 2.1 Keyword arguments in constructors
 The aim here is to make references to constructors more user
 friendly, especially when structures having a large number of
 components are constructed. The proposal is to allow the user to
 specify the expression/component correspondence using either
 positional or keyword argument syntax, with the keywords being
 the component names. For example, given a type defined by
 
 TYPE STOCK_ITEM
   INTEGER :: id,holding,buy_level
   CHARACTER(LEN=20) :: desc
   REAL :: buy_price,sell_price
 ENDTYPE STOCK_ITEM
 
 the two constructor references below would mean the same thing.
 
 STOCK_ITEM(12345,75,10,"Pencils HB",1.56,2.49)
 
 STOCK_ITEM(desc="Pencils HB", id=12345, &
            holding=75, sell_price=2.49, &
            buy_level=10, buy_price=1.56 )
 
 The latter is substantially more indicative of the intent of the
 reference. It makes the correspondence between component and
 expression obvious and incidentally makes the use of assignment
 semantics for establishing the resulting component value equally
 obvious. 
     This last point raises a question. If intrinsic assignment
 has been overridden by a user-defined assignment for a particular
 expression and component combination, shouldn't this user-defined
 assignment be used to determine the constructed component value?
 The edits proposed in the initial version of this paper assumed
 that the answer to this question should be yes. It was pointed
 out by John Reid that this would unfortunately lead to an
 incompatibility with F90, so in this version the edits are
 modified to give the answer no: the intrinsic assignment
 semantics are always used for the component expression
 assignment.
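
 The "no" answer above can be sketched as follows. This is only an
 illustration, assuming a compiler that accepts component keywords
 in constructors (as later standardised in Fortran 2003). The types
 CELL and BOX and the procedure cell_assign are hypothetical names
 invented for the example; the defined assignment deliberately
 negates the value so that any use of it would be visible.

```fortran
! Hypothetical types CELL and BOX, used only for illustration.
module cell_mod
  implicit none
  type :: cell
    integer :: v
  end type cell
  type :: box
    type(cell) :: c
  end type box
  ! User-defined assignment for CELL = CELL. It is deliberately odd
  ! (it negates the value) so that any use of it can be detected.
  interface assignment(=)
    module procedure cell_assign
  end interface
contains
  subroutine cell_assign(lhs, rhs)
    type(cell), intent(out) :: lhs
    type(cell), intent(in)  :: rhs
    lhs%v = -rhs%v
  end subroutine cell_assign
end module cell_mod

program constructor_assignment_demo
  use cell_mod
  implicit none
  type(cell) :: a, b
  type(box)  :: bx
  a%v = 5          ! integer = integer, always intrinsic
  b = a            ! CELL = CELL, the defined assignment is invoked
  bx = box(c=a)    ! constructor takes the component value intrinsically
  print *, "defined assignment gives ", b%v
  print *, "constructor gives        ", bx%c%v
end program constructor_assignment_demo
```

 Under the rule proposed here, b%v is negated by the defined
 assignment while bx%c%v keeps the value of a%v.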
     For pointer components the keyword would still be the
 component name and the component= form would be used, although
 the intrinsic assignment semantics implied are now those of
 pointer assignment, and the expression must deliver a result
 that has the target attribute.
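
 A minimal sketch of the pointer case, again assuming a compiler
 that accepts component keywords in constructors. The type READING
 and the variable sensor are hypothetical names chosen for the
 example.

```fortran
program pointer_component_demo
  implicit none
  ! Hypothetical type with one ordinary and one pointer component.
  type :: reading
    integer       :: id
    real, pointer :: sample
  end type reading
  real, target  :: sensor
  type(reading) :: r

  sensor = 1.5
  ! The keyword form for a pointer component implies pointer
  ! assignment, so the expression must have the TARGET attribute.
  r = reading(id=7, sample=sensor)

  ! Because r%sample is associated with sensor rather than holding
  ! a copy, a later change to sensor is seen through the component.
  sensor = 2.5
  print *, r%id, r%sample, associated(r%sample, sensor)
end program pointer_component_demo
```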
     As with actual arguments, positional correspondence should
 be permitted up to the first use of the keyword form; all
 subsequent component/expression arguments would have to be of the
 keyword form.
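
 The three permitted forms (all positional, all keyword, and
 positional followed by keyword) can be compared in a small
 program. This is a sketch that assumes a compiler accepting the
 keyword syntax proposed here; the helper function same is a
 hypothetical name added for the comparison.

```fortran
program stock_item_demo
  implicit none
  type :: stock_item
    integer :: id, holding, buy_level
    character(len=20) :: desc
    real :: buy_price, sell_price
  end type stock_item
  type(stock_item) :: a, b, c

  ! Purely positional form, as in F90.
  a = stock_item(12345, 75, 10, "Pencils HB", 1.56, 2.49)

  ! Purely keyword form, in arbitrary order.
  b = stock_item(desc="Pencils HB", id=12345, &
                 holding=75, sell_price=2.49, &
                 buy_level=10, buy_price=1.56)

  ! Positional up to the first keyword, keywords thereafter.
  c = stock_item(12345, 75, 10, desc="Pencils HB", &
                 sell_price=2.49, buy_price=1.56)

  ! A positional expression after a keyword, for example
  ! stock_item(id=12345, 75, ...), would not be permitted.

  print *, same(a, b), same(a, c)

contains

  ! Hypothetical helper: component-by-component comparison.
  logical function same(x, y)
    type(stock_item), intent(in) :: x, y
    same = x%id == y%id .and. x%holding == y%holding .and. &
           x%buy_level == y%buy_level .and. x%desc == y%desc .and. &
           x%buy_price == y%buy_price .and. &
           x%sell_price == y%sell_price
  end function same
end program stock_item_demo
```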
 {{{      Note, if parameterised data types are added, the only
           addition to this proposal that is needed is that the
           type parameter names may be used as keywords to
           indicate the correspondence for parameter value
           expressions.                                      }}}
 
 2.2 Allowing user defined overloads
 This change actually requires few edits to the standard. It
 merely requires the constructor to be defined as a generic
 procedure, thereby allowing users to provide their own
 overloads for it by use of the generic interface block.
     However, this makes the definition of class 1 names even
 more inconsistent than it currently is. Even as it stands the
 definition of class 1 is flawed. The type name is not unique in
 the class; the type name appears both as the name of the type and
 as the name of the type value constructor. An additional, but
 optional, part of this proposal would place the type names into a
 separate class (there already exists a rule that says that the
 set of all accessible types must be uniquely named, which is
 essentially what is meant by a class of name), with the
 constructor name being considered to be that of a generic
 function. This essentially extends to all derived types the
 situation that already applies to REAL.
     For a type such as
 
 TYPE RATIONAL
   INTEGER :: num,den
 ENDTYPE RATIONAL
 
 this would allow a user-defined constructor to be specified that
 would convert a normal integer value, say, to a rational number:
 
 INTERFACE RATIONAL
   FUNCTION int_to_rat(int)
     type(RATIONAL)::int_to_rat
     INTEGER::int
   ENDFUNCTION int_to_rat
   .....
 ENDINTERFACE
 
 Of course it would also be useful in such a case to have other
 generic overloads for the constructor, such as real to rational
 or even a character string literal denoting a real value to
 rational.
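
 A complete version of the integer overload might look like the
 following sketch. It assumes a compiler that permits a generic
 name to coincide with a derived type name (as later standardised
 in Fortran 2003), and it uses a module procedure rather than the
 external function of the interface above. The dummy argument name
 n and the module and program names are choices made for the
 example only.

```fortran
module rational_mod
  implicit none
  type :: rational
    integer :: num, den
  end type rational

  ! Generic interface with the same name as the type: the user
  ! overload sits alongside the intrinsic constructor.
  interface rational
    module procedure int_to_rat
  end interface rational

contains

  function int_to_rat(n) result(r)
    integer, intent(in) :: n
    type(rational) :: r
    ! Components are set directly so that this function does not
    ! itself depend on how a constructor reference is resolved.
    r%num = n
    r%den = 1
  end function int_to_rat
end module rational_mod

program rational_demo
  use rational_mod
  implicit none
  type(rational) :: half, three
  half  = rational(1, 2)   ! two integers: intrinsic constructor
  three = rational(3)      ! one integer: resolves to int_to_rat
  print *, half%num, half%den
  print *, three%num, three%den
end program rational_demo
```

 The two references are distinguished only by their argument
 lists, which is exactly the disambiguation expected of a generic
 function.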
     In considering the class of names situation one should note
 that the primary role of the concept of class of name is to
 provide an easy way of describing the restrictions that apply to
 the use of names for different entities. In particular the
 concept is supposed to indicate clearly when a particular name is
 required to be unique and when it might be used to identify more
 than one program entity. Such rules help describe those features
 of the language designed to make it impossible to write a legal
 program that is ambiguous. The concept of class was invented to
 enable the expression of the rule that a reference via a given
 name should not identify more than one program entity in the same
 class.  Unfortunately the way it has been used does not in fact
 do this, and it also causes a number of restrictions to be
 applied which render entirely unambiguous programs illegal.
     The principle that is to be followed in the discussion below
 is that the restrictions should be so phrased and the classes so
 defined that:
     -    names identifying specific entities are in fact unique
           within a class,
     -    names identifying generic references are restricted
           only by the overriding rule that no ambiguity is
           allowed.
 Following these principles leads to maximum freedom to the user
 in choice of names and greatly reduces name space management
 problems for the user.
     The current class rules lump together a whole raft of very
 different entity names into a single class. This results in the
 application of uniqueness restrictions not otherwise required by
 either syntax or the semantics of the language and it fails to
 apply some uniqueness restrictions which are required. The
 definition of class 1 local names,
 
     "Named variables that are not statement entities, named
      constants, named constructs, statement functions, internal
      procedures, module procedures, dummy procedures, intrinsic
      procedures, generic identifiers, derived types, and namelist
      group names"
 
 includes construct names, type names and generic procedure names
 in the same class as variables, constants, specific procedure
 names, etc. 
     Construct names are totally distinguished by context. A name
 used as a construct name cannot be ambiguous with any other use
 of such a name. In fact the essential scope of a construct name
 is actually the construct which it identifies, although it is not
 proposed to go that far here. There is no possibility of
 confusing a construct name with any other object identified by
 a class 1 name. As a step in the direction of removing
 unnecessary restrictions it is proposed that construct names be
 removed from the class 1 list and placed in a class by
 themselves.
 {{{      Note, construct names behave very similarly to
           statement labels. If at any time we were to allow
           alphanumeric labels then the class of construct names
           and labels would be the one requiring local uniqueness.
                                                                          }}}
     Type names are required to be unique within the class of
 type names by restrictions applied elsewhere in the standard, but
 this restriction is not applied by the class rules, in spite of
 the fact that the sole function of the class concept and rules is
 to apply such uniqueness restrictions. However, the type name
 enters class 1 as the name of two different entities, the
 name of the type and the name of the constructor. Apart from
 the syntactic identity between a function reference and a
 constructor reference there is no context in which the type name
 is not syntactically distinguished from a possible use of the
 name as some other entity. However, it is precisely this identity
 which on the face of it places the type name in class 1.
     Quite deliberately, generic procedure names are names that
 are intended to identify a number of different specific entities,
 where the reference necessarily contains other information
 (arguments) which must serve to disambiguate the reference. Since
 by definition generic procedure names behave in all sorts of ways
 that are different from all other names, it is hardly surprising
 that trying to include them within the rules which apply to the
 use of other names causes problems.
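
 As a reminder of how a generic name identifies several specific
 entities, the following sketch defines a generic with two
 specifics that are distinguished solely by argument type. All
 names in it (twice, twice_int, twice_real) are hypothetical and
 serve only as illustration.

```fortran
module twice_mod
  implicit none
  ! One generic name, two specific procedures.
  interface twice
    module procedure twice_int, twice_real
  end interface twice
contains
  function twice_int(i) result(r)
    integer, intent(in) :: i
    integer :: r
    r = 2*i
  end function twice_int

  function twice_real(x) result(r)
    real, intent(in) :: x
    real :: r
    r = 2.0*x
  end function twice_real
end module twice_mod

program generic_demo
  use twice_mod
  implicit none
  print *, twice(21)    ! integer argument: resolves to twice_int
  print *, twice(1.5)   ! real argument: resolves to twice_real
end program generic_demo
```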
     In section 5. there is a statement that is taken over from
 F77 to the effect that the declaration of a generic name in a
 type statement is not sufficient to remove the generic property
 of the name. This has been taken to mean that if a name is
 declared in this way and subsequently used in what is clearly a
 variable reference, the name in fact no longer has the generic
 property. However, if the name is used only in procedure
 references, whether one that conforms to the declared type or one
 of the other generic references, all such procedure references
 remain valid. For example,
 
 REAL :: SIN,x,y
 COMPLEX :: z
 ...
 x=SIN(y)
 z=SIN(CMPLX(x,y))
  
 is a valid program with both references to SIN being to the
 generic function SIN. However, if this was slightly modified,
 
 REAL :: SIN,x,y
 COMPLEX :: z
 ...
 x=SIN
 z=SIN(CMPLX(x,y))
 
 so that the first reference to SIN is now to a local variable
 called SIN, the second reference becomes invalid, in spite of
 remaining entirely unambiguously resolvable as a generic
 reference to the function SIN with a complex argument. This
 behaviour is highly counterintuitive to many experienced
 programmers, let alone easily explicable to the novice. The
 standard is seriously unclear in this area, and the inclusion
 of generic names in class 1 does little to help, since the text
 of the standard promptly excludes generic procedure names from
 the uniqueness requirement.
     A significant clarification in this area would be to treat
 generic names as an entirely separate category of name subject to
 their own rules and with a precisely defined relationship with
 objects in class 1. It is suggested that the essential
 requirement of the rules should be that all references made
 using a particular generic identifier are unambiguous;
 there is no justification for being more restrictive than that.
 In the above example there is no ambiguity. The entity being
 referenced by the name SIN in each case is totally distinguished
 by the context: one is unambiguously a reference to a scalar
 variable, the other to a generic procedure.
     A more difficult case could be, say, the following:
 
 REAL :: real(5),x,y
 COMPLEX :: z
 ...
 x=real(y)
 y=real(2)
 z=real(CMPLX(x,y))
 
 The first and last references to real should not cause any
 problem; they are unambiguously references to the generic
 conversion function real(), the first with a real argument, the
 last with a complex argument. It is the middle reference that is
 potentially ambiguous. Is this a reference to the array element
 real(2) or to the generic function converting the integer value
 2 to the corresponding real value? Most programmers would, I
 think, assume that the class 1 entity, the array element
 reference, is what is intended, and in such cases this is what
 should be provided. The variable use should mask the specific
 clashing function reference. However, it should not, as appears
 to be the current interpretation, mask all generic references
 and hence render this program illegal.
     Provided the constructor is classified as a generic
 procedure, the type names could be classified as forming a class
 of their own where strict uniqueness would be required and where
 intrinsic and derived types would form part of the same class;
 thereby removing another unnecessary irregularity in the
 treatment of types. Variables with the same name as a type would
 also become possible, subject to the same sort of restrictions
 that apply to variables with the same name as an intrinsic type,
 cf. real.
 
 
 3 Proposed Edits to IS 1539 : 1991     
 These edits are proposed as an indication to the editorial
 committee as to the sort of changes that would be necessary to
 implement these proposals. They too are presented in two parts.
 The first part must be included if the second is implemented but
 the second could be omitted and still leave a functionally useful
 addition to the language.
 
 3.1 Keywords in constructors edits
 
 3.1.1 4.4.4   [37/5]
     Replace "expr-list" with "comp-expr-list"
     Add 
 R430.1   comp-expr is   [component-name=]expr
 
 Constraint:   Each component-name must be the name of a
                component specified in the type definition for
                the type-name.
 
 Constraint:   The component-name= may be omitted only if it has
                been omitted from each preceding comp-expr in the
                comp-expr-list. 
 
 3.1.2 4.4.4   [37/7]
     After "type." add sentence
 The correspondence between expression and component may be
 indicated by the component name appearing explicitly in the form
 of a keyword in a manner similar to procedure argument
 association (12.4.1).
 
 
 3.2 Constructor as generic function
 
 3.2.1  4.4.4  [37/2]
     After "corresponding" add "generic function reference that
 is a"
 {{{      I believe that this is actually the only essential edit
           required to allow the functionality required.                  }}}
 
     The following are a set of proposed edits which implement
 this proposed clarification and reclassification of names.
 
  3.2.2 14.1.2 [241/23-25]
     Replace item (1) in the list by the following and renumber
 the list.
 (1) named variables that are not statement entities (14.1.3),
      named constants, statement functions, internal procedures,
      module procedures, dummy procedures, specific names for
      intrinsic procedures and namelist group names,
 
 (2) named control constructs,
 
 (3) type names, intrinsic and derived,
 
 (4) generic identifiers,
 
 3.2.3 14.1.2       [241/33]
     Delete ",except in ...generic names (12.3.2.1)"
 
 3.2.4 14.1.2       [241/35]
     Before "intrinsic" add "specific"
               [242/10]
     After "SIN" add "with a default real argument", before
 "intrinsic" add "specific"
               [242/11]
     Add sentence
     "A reference with an argument type of complex or any real
 kind other than the default real still refers to the generic
 intrinsic functions identified by SIN."
 
 3.2.5 14.1.2.3     [243/13+]
     Add paragraph
 If an interpretation of a name exists as a reference to a class 1
 entity, this is used instead of a generic reference via the same
 name even if such an interpretation is possible. For example,
 the following program fragment
 
 ...
 REAL :: REAL(5),X,Y
 COMPLEX :: Z
 ...
 X = REAL(2)
 Y = REAL(2.0)
 Z = REAL(CMPLX(X,Y))
 ...
 
 is a valid program, where X is assigned the value of the second
 element of the array REAL, Y is assigned the value produced by
 invoking the generic type conversion intrinsic function REAL with
 a real argument, and Z is set to the value produced by calling
 the same generic function with a complex argument. The variant of
 the generic function that, in the absence of the array REAL, would
 convert an integer to a default real is rendered inaccessible by
 the declaration of the array.
-----------------------------------------------------------
-- 
Dr.J.L.Schonfelder
Director, Computing Services Dept.
University of Liverpool, UK
Phone: +44(51)794 3716
FAX  : +44(51)794 3759
email: jls@liv.ac.uk   

