Date:     Wed, 05 Aug 92 11:26:21 BST
From: Lawrie Schonfelder <JLS@liverpool.ac.uk>
Subject:  String Standard Issues
To: SC22/WG5 members <SC22WG5@dkuug.dk>

This is the paper you asked for on the issues relating to the string standard.
I have also read the SC22 ballot request that came from Joe Cote. As far as
I can see, this is the three-month CD letter ballot. I think that if it passes,
the document becomes a CD and must then be processed in accordance with the
member body ballot comments, with a view to preparing it for DIS balloting.
If it fails, the document would remain a WD and would have to be revised to
resolve the NO votes and be resubmitted for CD balloting. The closing date
for this ballot is 28 August.


                               ISO/IEC JTC1/SC22/WG5 - Nddd
From: Lawrie Schonfelder

                ISO VARYING STRING STANDARD

             Issues Raised at Victoria Meeting

1  The general issue of how best to make available the source text of the
   example module was raised. It was suggested that it would be better to
   publish a descriptive annex which explained the module structure and
   included some samples of the code but not the whole text, the whole
   text being thought too voluminous to be useful. If this were done, the
   full source text would be made available electronically by some means.

   [Project Editor's comment: This change would, I think, be a substantial
   improvement to the document. However, it would take a significant
   amount of time to write the necessary descriptive annex, which could
   delay the final move to DIS status.]

2  The issue of the generic name for the type conversion function
   converting from VARYING_STRING to CHARACTER was raised.
   The overloading of the intrinsic type conversion function CHAR was
   questioned again with the suggestion that some other name be used.

   [Project Editor's comment: This is an issue which was discussed first in
   Rotterdam and it was decided that the generic name for all functions
   converting values from another type into CHARACTER should be the
   same, namely CHAR. I am disturbed at what appears to be a tendency
   to revisit old issues.]

3  A question was raised as to the need for greater symmetry between the
   definition of VAR_STR and that of CHAR; in particular, should
   VAR_STR have a version which includes a second parameter giving
   the length?

   [Project Editor's comment: This appears to be a step entirely in the
   wrong direction. The length of the result is explicitly controllable by the
   use of substringing and/or concatenation to restrict or pad the length
   of the argument. The addition of extra arguments in the proposed way
   is unnecessary and undesirable. Symmetry should be obtained by
   removing the length argument from CHAR.]
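
   A minimal sketch of controlling the length in the argument rather
   than through an extra parameter. It uses only intrinsic CHARACTER
   operations so that it compiles without the string module; the program
   name and the values are illustrative assumptions. Under the semantics
   described in this paper, VAR_STR applied to each expression would
   yield a string of the length printed.

```fortran
program length_control
  implicit none
  character(len=5) :: ch = "ABCDE"

  ! Restrict: a substring passes only the first three characters.
  print *, len(ch(1:3))               ! 3

  ! Pad: concatenation extends the value to eight characters.
  print *, len(ch // repeat(" ", 3))  ! 8
end program length_control
```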

4  The question of regularity with other type conversion functions, such
   as REAL, which include the identity conversion, was raised with
   respect to VAR_STR: should a version of VAR_STR with a
   VARYING_STRING argument be included?

   [Project Editor's comment: If this is done then consistency should be
   checked across the board, including all the type conversion intrinsic
   functions. For example, CHAR does not currently have a version for a
   CHARACTER argument. Perhaps it should, but identity operations of
   this sort do not appear to be very useful.]

5  The procedures for manipulating parts of strings, INSERT, REPLACE,
   REMOVE and EXTRACT, are currently all functions. It was suggested
   that INSERT, REPLACE and REMOVE would be better as
   subroutines.

   [Project Editor's comment: This is again a revisiting of an issue. This is
   a committee procedural problem. We need to establish a procedure to
   constrain such revisiting of issues.]

6  There was a suggestion that the limited set of I/O procedures was too
   restricted to be regarded as comprehensive on the one hand, and that
   on the other they were more elaborate than was necessary. The view
   was expressed that until and unless the main language is improved to
   allow proper abstraction of derived type I/O, a supplementary standard
   such as this should include only the minimum I/O facilities consistent
   with utility. To this end the READ_STRING procedures should be
   simplified to include only the "read to end of record" variants. The
   effect of the variants with SET and/or MAXLEN arguments can be
   obtained by the user with simple alternative coding, for example by
   conventional input to a character variable or by appropriate use of
   the INDEX, SCAN and EXTRACT procedures.

   [Project Editor's comment: I would agree with this view.]
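
   A minimal sketch of such alternative coding, assuming the record has
   already been read into an ordinary CHARACTER buffer. The terminator
   set, the length limit and the sample text are hypothetical; the intrinsic
   SCAN finds the first terminator and a substring takes the place of
   EXTRACT.

```fortran
program read_up_to_set
  implicit none
  character(len=80) :: line                  ! conventional fixed-length buffer
  character(len=*), parameter :: set = ",;"  ! hypothetical terminator set
  integer, parameter :: maxlen = 10          ! hypothetical length limit
  integer :: k, n

  ! Stands in for: READ (unit, '(A)') line
  line = "alpha,beta;gamma"

  k = scan(line, set)        ! position of first terminator, 0 if none
  if (k == 0) then
    n = len_trim(line)       ! no terminator: take the whole record
  else
    n = k - 1                ! stop just before the terminator
  end if
  n = min(n, maxlen)         ! apply the length limit

  print *, line(1:n)         ! alpha
end program read_up_to_set
```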

7  The essentially non-generic choice of names for the I/O procedures was
   also raised. It was pointed out that users and possibly other standards
   might wish to create similar modules for different types and could well
   want to have generic I/O procedures with a broadly similar role. They
   would logically wish to overload the existing standardised generic
   names. This would be unlikely, however, since the current names,
   READ_STRING, WRITE_STRING and WRITE_LINE, indicate the type
   of their main argument in their name. They are not actually generic in
   character.

   [Project Editor's comment: The original names for these procedures,
   PUT, GET and PUT_LINE, were more generic in character.]
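
   A minimal sketch of why a type-neutral name matters: two unrelated
   modules can each extend the same generic name, and a program using
   both sees a single generic procedure. The modules POINT_IO and
   FLAG_IO, the type POINT and the specific procedures are hypothetical;
   only the generic name PUT is taken from the comment above.

```fortran
module point_io
  implicit none
  type point
    real :: x, y
  end type point
  interface put                 ! generic name, not tied to one type
    module procedure put_point
  end interface
contains
  subroutine put_point(p)
    type(point), intent(in) :: p
    write (*, '(a,f6.2,a,f6.2,a)') "(", p%x, ",", p%y, ")"
  end subroutine put_point
end module point_io

module flag_io
  implicit none
  interface put                 ! a second module extends the same name
    module procedure put_flag
  end interface
contains
  subroutine put_flag(b)
    logical, intent(in) :: b
    if (b) then
      write (*, '(a)') "yes"
    else
      write (*, '(a)') "no"
    end if
  end subroutine put_flag
end module flag_io

program generic_put
  use point_io
  use flag_io
  implicit none
  call put(point(1.0, 2.0))     ! resolves to put_point
  call put(.true.)              ! resolves to put_flag
end program generic_put
```

   A name such as WRITE_STRING could be overloaded in the same way,
   but a call WRITE_STRING(point) would read oddly, which is the
   objection raised.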

8  It was pointed out that many users will be surprised by the fact that
   strings such as "XX" and "XX   " will produce the result TRUE in a
   logical comparison for equality because of the blank padding rule. It
   was suggested that this rule, although acceptable for CHARACTER, was
   inappropriate for strings of varying length.

   [Project Editor's comment: This is another revisit of a past issue.
   Although in principle this appears odd, incompatibility between string
   and character comparisons would be even more surprising to most
   users.]
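
   A minimal sketch of the blank padding rule as it applies to intrinsic
   CHARACTER values, which is the behaviour the string comparisons are
   intended to match. The program and variable names are illustrative.

```fortran
program blank_padding
  implicit none
  character(len=2) :: a = "XX"
  character(len=5) :: b = "XX   "

  ! The shorter operand is treated as if extended with blanks.
  print *, a == b                        ! T
  ! The lengths themselves still differ (2 and 5).
  print *, len(a) == len(b)              ! F
  ! Ignoring trailing blanks, both hold two characters.
  print *, len_trim(a) == len_trim(b)    ! T
end program blank_padding
```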

9  A somewhat related issue also arose as to the exact semantics that
   should be used for the VAR_STR procedure. Given that
   CHARACTER variables are right-padded with blanks when their length
   exceeds that of any value assigned to them, it was suggested that these
   padding characters should be stripped off when converting to
   VARYING_STRING.

   [Project Editor's comment: Another revisit to a settled issue! It was
   previously agreed that, since the user could easily trim off any padding
   by use of the intrinsic TRIM function, it was better for the VAR_STR
   procedure merely to convert the type and make no change in value. As
   a general principle, a single procedure should perform a single,
   well-defined function. Where a required modification of the result can
   be obtained by first modifying the argument with some other
   procedure, this is preferable. Obtaining a string without padding is
   better done by a reference VAR_STR(TRIM(ch)) than by changing the
   semantics of VAR_STR to always do this.]
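
   A minimal sketch of the difference, using only intrinsic CHARACTER
   operations so that it compiles without the string module. The variable
   and its value are illustrative; the comments state what VAR_STR would
   be given under the agreed "convert the type only" semantics.

```fortran
program trim_before_convert
  implicit none
  character(len=10) :: ch

  ch = "hello"              ! stored as "hello" followed by five blanks

  print *, len(ch)          ! 10: the value VAR_STR(ch) would convert
  print *, len(trim(ch))    ! 5:  the value VAR_STR(TRIM(ch)) would convert
end program trim_before_convert
```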

10 By far the most controversial issue raised was what, if anything, should
   be done in the definition of the VARYING_STRING standard to
   attempt to overcome the potential inefficiencies resulting from the error
   in the main standard whereby the initial status of pointers is
   undefined. Because of this feature of the main language it is impossible
   for the example module as written to control memory leakage. In fact,
   because of the initially undefined pointer error, it is difficult to see
   how any simple module in Fortran 90 could easily be both memory
   efficient and safe. It was therefore suggested that the
   VARYING_STRING standard and module should include INIT and FREE
   subroutines, along with a requirement on the user that no string
   variable be used for any purpose until it had first appeared in a call
   to the INIT procedure.

   [Project Editor's comment: I believe this to be a thoroughly bad idea
   for a number of reasons.
   a)  The standard is intended to define the interface and functional
       semantics only. The expectation is that high quality
       implementations of Fortran 90 will implement the module
       interface as an integral part of their processor. They can therefore
        avoid the inefficiencies inherent in the example module, which in
        any case was written not for optimal efficiency but merely as an
        illustrative existence proof that a portable implementation was
        possible.
   b)  The current module defines only facilities considered to be useful
       and functionally necessary for a user of varying strings. The
       technical details relating to particulars of the specific, indeed any,
       implementation are hidden. Routines such as INIT and FREE do
       not fall into such a category. They would not be seen by the
       average user as having any obvious connection with the job of
       manipulating strings of characters. For this reason users would not
       naturally use such routines and would very likely omit to do so in
        spite of documentation indicating that they must. A module that
        depended on the assumption that they had done so would therefore
        be very unsafe in practice. Garbage collecting an undefined pointer
        is not likely to improve the reliability of the program. With the
        standard as now defined, the example module is relatively easy and
        natural to use, and it makes no specialised assumptions about a
        user's program. It is safe, if possibly inefficient on some
        implementations.
       With a required INIT procedure it would become potentially more
       efficient but at the expense of assuming unnatural coding practices
       likely to make it highly unsafe.
   c)  It is not a sensible approach overall to try to build temporary
        workarounds in supplementary standards for inefficiencies caused by
       general defects in the main language. The general defect should be
       fixed and then the user of the supplementary standard obtains the
       benefit in increased efficiency without a change to either his
       programs or the supplementary standard.]
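
   A minimal sketch of the INIT/FREE scheme under discussion, to make the
   trade-off concrete. It is not taken from the example module: the type
   VSTRING and the procedures VS_INIT, VS_FREE, VS_SET and VS_LENGTH are
   hypothetical. The sketch assumes Fortran 90 rules, under which a
   pointer component has undefined association status until it is
   nullified, allocated or pointer-assigned, and ASSOCIATED may not be
   applied to a pointer whose status is undefined.

```fortran
module vstring_sketch
  implicit none
  private
  public :: vstring, vs_init, vs_free, vs_set, vs_length

  type vstring
    private
    ! Association status is undefined until VS_INIT is called.
    character(len=1), dimension(:), pointer :: chars
  end type vstring

contains

  subroutine vs_init(s)
    type(vstring), intent(out) :: s
    nullify(s%chars)              ! give the pointer a defined status
  end subroutine vs_init

  subroutine vs_free(s)
    type(vstring), intent(inout) :: s
    if (associated(s%chars)) deallocate(s%chars)
    nullify(s%chars)
  end subroutine vs_free

  subroutine vs_set(s, text)
    type(vstring), intent(inout) :: s
    character(len=*), intent(in) :: text
    integer :: i
    ! Releasing the old storage is legitimate only if VS_INIT was
    ! called first; without it this test is applied to an undefined
    ! pointer, which is the safety problem described above.
    if (associated(s%chars)) deallocate(s%chars)
    allocate(s%chars(len(text)))
    do i = 1, len(text)
      s%chars(i) = text(i:i)
    end do
  end subroutine vs_set

  function vs_length(s) result(n)
    type(vstring), intent(in) :: s
    integer :: n
    n = size(s%chars)
  end function vs_length

end module vstring_sketch

program init_free_demo
  use vstring_sketch
  implicit none
  type(vstring) :: s

  call vs_init(s)                 ! the step users are likely to forget
  call vs_set(s, "first")
  call vs_set(s, "second value")  ! old storage is released, no leak
  print *, vs_length(s)           ! 12
  call vs_free(s)
end program init_free_demo
```

   Omitting the VS_INIT call leaves a program that still compiles, so the
   safety of the whole scheme rests on a step with no visible connection
   to string handling. Dropping the ASSOCIATED test instead makes VS_SET
   safe without VS_INIT, at the price of leaking the old storage on every
   reassignment.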

11 General Project Editor request.
   If member bodies intend to request changes to the standard, it would
   greatly expedite processing if the member body comments on the CD
   ballot could be phrased in terms of proposals to change the document
   along with suggested text. The project editor and development body
   could then prepare a comment response document as one or more sets
   of change proposals, which could be balloted in WG5 by mail.
