From owner-sc22wg5+sc22wg5-dom8=www.open-std.org@open-std.org  Sun Dec  7 20:51:57 2014
From: Bill Long <longb@cray.com>
To: WG5 List <sc22wg5@open-std.org>
Subject: Re: (j3.2006) (SC22WG5.5361) Straw vote on draft DTS
Thread-Topic: (j3.2006) (SC22WG5.5361) Straw vote on draft DTS
Thread-Index: AQHP+4DRZYe/qRNCrUC4rq0FL+eRgJyFGk4A
Date: Sun, 7 Dec 2014 19:46:03 +0000
Message-ID: <9979B4BC-2BC0-4214-A871-9CF17312BA15@cray.com>
References: <20141108182113.6EC5E3581CE@www.open-std.org>
In-Reply-To: <20141108182113.6EC5E3581CE@www.open-std.org>




Please answer the following question "Is N2033 ready for forwarding to
SC22 as the DTS?" in one of these ways.

1) Yes.
2) Yes, but I recommend the following changes.
3) No, for the following reasons.
4) Abstain.

-----------------------

Yes, but I recommend the following changes.

N2033: [14:23] Delete “,without synchronization of coarray deallocations”.

Tom Clune, and others since, have noted that this phrase increases the
uncertainty about how the recovery of a stalled image is expected to be
implemented. Additionally, it conflicts with a basic tenet of coarrays:
that the existence of a coarray should be consistent across the images
where the coarray was allocated. If a stalled image prematurely
deallocates a coarray, accesses from an active image might produce
nonsense results, or even fail. This would be an undesirable exception
to our normal rules.

-------------------------

Additional general comments:

Nick explained the rationale behind the stalled image classification. I
would just add one background note. Most of the modes of inter-image
activity involve statements (image control statements or calls to
intrinsics) that have an optional STAT= specifier or STAT argument. In
those cases, an abnormal state can be detected by a programmer and
explicitly acted upon with statements in the program. If the program
fails to use these facilities (no STAT= specifier, or the optional STAT
argument omitted) and an error condition occurs, the program aborts, as
has long been the case. The one exception to this model is a simple
reference or definition of a variable on a remote image using the
image-selector syntax. There is no “STAT” method available there, nor
would it make much sense, since the designator that includes the image
selector could be in many places of a complicated expression or
statement. The stalled image facility addresses this case, plugging an
otherwise serious hole.

There is substantial opinion that implementing stalled image recovery is
not easy. I do not disagree. In simplest terms, it is equivalent to
implementing the infrastructure for an exception handling mechanism. It
is a bit simpler: the handler is basically internal to the runtime
rather than user-specified, and if the relevant END TEAM statement lacks
a STAT= specifier, the code would end up aborting anyway, so there is no
need to do much before then. However, the basic process of unwinding the
call stack (if there is one) that grew after the CHANGE TEAM statement
execution is more or less the same as for an exception handler. Given
that exception handlers already exist in other languages, and certainly
at the system level, the argument that implementors do not know how to
do this seems weak at best. I understand grumbling about hard work, not
claims of inability.

The more general question of whether Fortran should include fault
tolerance on a timely schedule at all is really a question of Fortran’s
future relevance in the HPC marketplace. And that is the only market
where Fortran has a significant fraction of programming language
mindshare. The need for this capability is in the 2018-2020 “exascale”
time frame. If we miss that window, we’re seriously disadvantaged. The
Fortran 2015 standard (with compilers available ~2018) is our last
opportunity to meet the schedule. Alternatives like MPI and SHMEM are
actively making progress in this area, realizing the same target dates
are looming.

The idea that vendors need to implement a facility like fault tolerance
before including it in the standard is out of touch with the realities
of modern-day compiler development. It might have been viable in the
past, but today’s compiler vendors will implement a feature AFTER it is
in the standard, not before. Not only is this an economic reality, but
also a positive for program portability. In many past cases where
vendors implemented new facilities outside the standard, the features
ended up being “extensions” that don’t go away but perpetually lead to
non-portable code for programmers who use them. On platforms with
multiple Fortran compilers, this is a recurring frustration.

Finally, Tobias raised, and Malcolm elaborated and provided details on,
the issue of finalization in the context of CO_BROADCAST and
(especially) CO_REDUCE. This issue is a side effect of the introduction
of intrinsic subroutines that allow INTENT(INOUT) arguments of types
that specify finalization. This case was not envisioned (or relevant)
when the current “4.5.6.3 When finalization occurs” was written.
Modification to the TS to account for this would be in Clause 8. I see
this as essentially an integration issue. While this is important, the
TS process also allows for subsequent modifications during integration,
so I don’t see this as an issue that should block the TS from
progressing to a vote.


Cheers,
Bill



Bill Long                                               longb@cray.com
Fortran Technical Support  &                            voice:  651-605-9024
Bioinformatics Software Development                     fax:  651-605-9142
Cray Inc./ Cray Plaza, Suite 210/ 380 Jackson St./ St. Paul, MN 55101


