From: "N.M. Maclaren"
Date: 22 Jan 2009 20:02:49 +0000
To: MPI-3 Fortran working group, WG5
Subject: Re: [ukfortran] (SC22WG5.3897) [MPI3 Fortran] (j3.2006) MPI non-blocking transfers

On Jan 22 2009, Bill Long wrote:
>Aleksandar Donev wrote:
>>
>> ! complex calculations using array to calculate values:
>> call mpi_isend(array,...)
>> ... ! array not modified here
>> call mpi_wait(...)
>> ! more calculations involving array reading the values
>>
>> Surely it is ok, desirable, or even necessary to optimize the
>> calculations???
>
>This example shows the problem well. There is no way for the compiler
>to tell that the mpi_wait call is related to the isend that is using
>array as its buffer. It might be for a different isend (and hence
>different buffer) started somewhere else. The only safe thing for the
>compiler to do is treat array as volatile throughout the subprogram
>containing these two calls. Which is just what happens if array is
>declared volatile. It's irrelevant whether you spell the attribute
>asynchronous or volatile. The effect is the same.

Yes, it shows the problem well - but the problem is that you don't seem
to understand either the Fortran ASYNCHRONOUS attribute or MPI
non-blocking semantics.

The compiler can optimise those calculations without concern, just as
it can if you replace the MPI transfer by Fortran asynchronous I/O.
If the MPI_Wait is unrelated to the MPI_Isend, then the user has made
an error and the behaviour is UNDEFINED. That is EXACTLY the same
situation as when you have illegal argument aliasing - no respectable
optimising compiler refuses to optimise argument accesses for that
reason.

>Reread Nick's earlier message. The optimization pitfalls extend past
>just argument passing.

Sigh. Yes, they do. But my previous message was describing why the
ASYNCHRONOUS attribute was essential, and omitting ANY attribute was
not going to work (irrespective of how many procedure flags you use).
Once you have that attribute, the problems that I described there have
ALREADY been dealt with.

There are only three things that an optimising compiler CAN'T do with
a variable with the ASYNCHRONOUS attribute that it can do with one
without any attribute:

1) Move it (including copy-in/copy-out), if it might be pending.
2) Move accesses across a construct (usually a procedure call) that
might change it to or from pending state.
3) Copy and restore parts of it that are not accessed by the code the
user wrote, if it might be pending.

Compared with what it can't do to ones with the VOLATILE attribute,
that's nothing.

>> I understand no compiler actually implements asynchronous so
>> no implementor has actually pinned down the list of what optimizations
>> are "not allowed",
>
>I'm sure at least some compilers do implement asynchronous.

Apparently Intel does, and it isn't exactly new technology, anyway!

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679