From owner-sc22wg5@open-std.org  Tue Apr 19 17:04:13 2011
Message-ID: <4DADA2B2.9060001@stfc.ac.uk>
Date: Tue, 19 Apr 2011 15:56:50 +0100
From: John Reid <John.Reid@stfc.ac.uk>
Reply-To: <John.Reid@stfc.ac.uk>
Organization: Rutherford Appleton Laboratory
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090908 Fedora/1.1.18-1.fc10 SeaMonkey/1.1.18
MIME-Version: 1.0
To: WG5 <sc22wg5@open-std.org>, Tobias Burnus <burnus@net-b.de>
Subject: Re: (j3.2006) Comments to 10-166 (early coarray TR draft)
References: <4D960451.7080909@net-b.de>
In-Reply-To: <4D960451.7080909@net-b.de>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

Tobias,

Thanks very much for this comment, which we will take into account when 
considering the coarray TR during the meeting in June.

I would like to draw your attention to the WG5 paper N1835. Do you agree that 
your comment is essentially in support of Bill Long's proposal 1? Do you have 
any comments on the other proposals?

With best wishes,

John.

> Admittedly, this is probably bad timing, as everyone is focused on TR
> 29113 rather than other work items. However, I happened to have time to
> glance at 10-166 (draft of the coarray TR, dated 2010/02/18).
> 
> First, I spotted an "ALL STOP" which should be an ERROR STOP (in A.1.1).
> 
> Secondly, I miss a way to broadcast values to all images (or to a
> team); unless I have missed something, even with the TR one still has to do:
> 
>   if (this_image()==1) then
>     ! READ input file
>     ! Distribute values:
>     do image = 2, num_images()
>       z[image] = z
>     end do
>   end if
>   SYNC ALL
> 
> (Alternatively, use "SYNC IMAGES(*)" in the IF branch and "SYNC IMAGES(1)"
> in an ELSE branch.) I think sending the value to every other image, one
> image at a time, is rather slow if many images are involved. (Assume a
> calculation on 6k Blue Gene
> processors or using the full 294,912 processors of the HPC system 600 
> metres from here.) On such systems, sending the configuration can then 
> take a significant amount of the total computation time. That time is 
> wasted especially as there is a dedicated collective network with 
> one-to-all broadcast functionality.
> 
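As an illustration of the point about image-by-image distribution, the loop
quoted above can be restructured as a binomial tree using only Fortran 2008
coarray features. This is a minimal sketch, not taken from 10-166 or N1835;
the program name, the variable names, the value 42.0 standing in for the
input file, and the choice of image 1 as the root are all assumptions. It
needs about log2(num_images()) rounds rather than num_images()-1 successive
assignments from image 1.

```fortran
program tree_broadcast
  implicit none
  real    :: z[*]                  ! value to distribute (hypothetical)
  integer :: me, n, step, partner

  me = this_image()
  n  = num_images()

  if (me == 1) z = 42.0            ! stands in for reading the input file

  ! At the start of each round, images 1..step already hold z.
  ! Each of them pushes z to image me+step, doubling the holders.
  step = 1
  do while (step < n)
    if (me <= step) then
      partner = me + step
      if (partner <= n) then
        z[partner] = z             ! one-sided put to the partner
        sync images (partner)      ! tell the partner that z has arrived
      end if
    else if (me <= 2*step) then
      partner = me - step
      sync images (partner)        ! wait for the put from the partner
    end if
    step = 2*step
  end do

  ! Self-check: every image must now hold the root's value.
  if (z /= 42.0) error stop "broadcast failed"
  if (me == 1) print *, "broadcast complete on", n, "images"
end program tree_broadcast
```

Only pairwise SYNC IMAGES is used, so no image waits at a global barrier. A
dedicated broadcast facility could still map better onto hardware such as a
collective network; the sketch only shows what can be written by hand.
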
> For reductions, the draft provides the most important ones. However, I 
> see again some unneeded communication as: "A collective subroutine is 
> one that is invoked on a team of images to perform a calculation on 
> those images and which assigns the value of the result on all of them" 
> (4.1.1). While that is often the desired result, one frequently needs
> the result on only one image. Returning to the many-processor case:
> performing the collective operation in a tree-like manner and sending
> the result to a single reduction-master image is faster than collecting
> it on all images, especially since there is a barrier (team
> synchronization) after the reduction, which could be avoided on all but
> the one image that needs the result.
> 
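The tree-like reduction to a single image described above can likewise be
sketched with Fortran 2008 features alone. Again this is only an assumed
illustration, not the collective proposed in the draft: the program name,
the variable names, the use of a sum, image 1 as the reduction master, and
this_image() as each image's contribution are all hypothetical.

```fortran
program tree_reduce
  implicit none
  integer :: total[*]              ! partial sum held on each image
  integer :: me, n, step, partner

  me = this_image()
  n  = num_images()
  total = me                       ! stand-in for a locally computed value

  ! In each round the surviving images pair up; the lower-numbered image
  ! of a pair pulls its partner's partial sum, and the partner drops out.
  step = 1
  do while (step < n)
    if (mod(me - 1, 2*step) == 0) then
      partner = me + step
      if (partner <= n) then
        sync images (partner)            ! partner's partial sum is now final
        total = total + total[partner]   ! one-sided get, then accumulate
      end if
    else
      partner = me - step
      sync images (partner)              ! signal that total is final
      exit                               ! done: no further synchronization
    end if
    step = 2*step
  end do

  ! Only the master image holds (and checks) the full result.
  if (me == 1) then
    if (total /= n*(n + 1)/2) error stop "reduction failed"
    print *, "sum over", n, "images =", total
  end if
end program tree_reduce
```

Each image synchronizes only with its partners, and an image that has handed
over its partial sum leaves the loop without waiting for the final result,
which is the saving argued for above.
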
> Tobias
> 
> PS: It would be nice if someone could save the three comments so that
> they can be discussed when the topic comes up again after TR 29113.
> 
> PPS: I hope I have found the latest draft.

