Date: 05 Nov 2008 23:28:01 +0000
From: "N.M. Maclaren"
To: sc22wg5
Cc: sc22wg5
Subject: Re: [ukfortran] (SC22WG5.3622) A comment on John Wallin's comments on Nick MacLaren's comments

John Wallin wrote:
>
> Here is a re-post of my response as per John Reid's request - with a few
> minor changes.

I don't think that you are acceptable to the SC22WG5 list!  Aren't modern
computers wonderful?

> > So serial Fortran will not go away any time soon, and some people will
> > positively want coarray-free compilers (or a fixable mode).
>
> I think that most people would not want to use a code that only runs on
> a single processor of a larger machine if they had access to even a
> basic PC.  The performance of the low-end PC with multiple cores will
> almost always trump the use of a single core on a larger multi-core
> machine.

Yes and no.  There are lots of reasons that people do run such codes -
though I agree that pure performance isn't one of them.

> 1) You can lower the priority of the jobs using the renice command. ...

Priorities haven't worked in 20 years :-(  Are you in Tokyo?  Ask me why,
if you are.

> 2) It would seem sensible to be able to set user limits on the number of
> images available.  This is clearly an implementation issue, not
> something intrinsic to the language.  (See the note by Alex)

Been there - done that :-(  Unfortunately, programs written assuming
multiple images won't necessarily work if restricted to a single image or
even two.  Parallel programmers count one, two, many ....

> > If you have any such examples that apply to distributed memory systems
> > without special RDMA/coarray hardware, please Email them to me.
>
> MPI is a distributed memory system, so my treecode runs on everything.
>
> Here is a simple example of an MPI coding horror.  I have to pass a
> Fortran derived type between nodes.

Oh, the derived type issue.  It's not fundamental to MPI, but is mainly
because MPI currently has only a Fortran 77 interface.  A proper Fortran
2003 interface would be a lot better.  It doesn't affect programs that
don't use derived types, of course.

Anyway, I regard the right solution to this as being that every derived
type should have encode and decode primitives, and that you should
transfer the encoded form.  You need them for a great many purposes,
including unformatted I/O, not just MPI.
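For what it is worth, here is a minimal sketch of the idea, using an
invented particle_t type purely for illustration (none of these names come
from anyone's real code):

      module particle_codec
         implicit none
         integer, parameter :: dp = kind(1.0d0)
         ! Hypothetical derived type, for illustration only.
         type particle_t
            real(dp) :: pos(3), vel(3)
            integer  :: id
         end type particle_t
         ! The encoded length lives in ONE place; adding a component means
         ! updating this module only, not every communication call.
         integer, parameter :: buflen = 7
      contains
         subroutine encode_particle (p, buf)
            type(particle_t), intent(in)  :: p
            real(dp),         intent(out) :: buf(buflen)
            buf(1:3) = p%pos
            buf(4:6) = p%vel
            buf(7)   = real(p%id, dp)
         end subroutine encode_particle
         subroutine decode_particle (buf, p)
            real(dp),         intent(in)  :: buf(buflen)
            type(particle_t), intent(out) :: p
            p%pos = buf(1:3)
            p%vel = buf(4:6)
            p%id  = nint(buf(7))
         end subroutine decode_particle
      end module particle_codec

The encoded buffer is then an ordinary array of double precision, so it
can go through a plain MPI_SEND/MPI_RECV of MPI_DOUBLE_PRECISION, or
straight to an unformatted file, with no per-type MPI bookkeeping at all.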
> The important thing to note is that every time a grad student adds a
> single element to the data structure, you have to alter the block
> counts and sizes by hand.

Sorry, but no.  There are better solutions.  I agree that none are pretty,
and all involve enforcing strict discipline on the programmers, but they
have been known since time immemorial.  Parameterisation is the key (see
the sketch after my signature).

> In short, coarrays would make my head hurt less.

Most of the users I have dealt with have backed off shared-memory
paradigms when they found that they couldn't debug or tune them, and have
gone back to MPI.  The problem is that there are, and can be, no tools to
trap race conditions.

> With the new multicore/multi-box architectures, we actually need a
> change in the language to write codes for these machines.

Agreed.  Now, whether coarrays are that change, I shall not say ....

> I would suggest talking to the UPC forum about this.  They have a lot
> of experience with this, and can address it directly.

Not much, actually.  There aren't many implementations, and UPC isn't much
used by real scientists.  Also, my investigations indicated that it
doesn't actually do what most people think that it does.  I can send you
the document and test program if you want.

> It is very possible to create codes with deadlocks on any machine.  Of
> course, higher latency combined with lower bandwidth brings these
> problems to the forefront more quickly.  However, this would seem to be
> a problem with the program rather than the language.  Non-synchronized
> memory doesn't behave well, so users need to beware.  Similar problems
> exist in all parallel languages.

I am talking about programs that have no deadlocks in them, but where
deadlocks are introduced by the implementation.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
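PS: As a rough illustration of the sort of parameterisation I mean (the
particle_t type and the routine name are invented for this example, not
taken from anyone's real code): the MPI type map can be built in one
routine that sits next to the type definition, so adding a component means
editing a single place rather than every call site:

      subroutine build_particle_type (newtype)
         implicit none
         include 'mpif.h'
         integer, intent(out) :: newtype
         ! Hypothetical derived type, defined here only for illustration.
         type particle_t
            double precision :: pos(3), vel(3)
            integer          :: id
         end type particle_t
         type(particle_t) :: dummy
         ! Block counts, lengths and types are kept in ONE place.
         integer, parameter :: nblocks = 3
         integer :: blocklen(nblocks), types(nblocks), ierr
         integer(kind=MPI_ADDRESS_KIND) :: disp(nblocks), base
         blocklen = (/ 3, 3, 1 /)
         types    = (/ MPI_DOUBLE_PRECISION, MPI_DOUBLE_PRECISION, &
                       MPI_INTEGER /)
         ! Displacements are measured, not assumed, so padding is handled.
         call MPI_GET_ADDRESS (dummy,     base,    ierr)
         call MPI_GET_ADDRESS (dummy%pos, disp(1), ierr)
         call MPI_GET_ADDRESS (dummy%vel, disp(2), ierr)
         call MPI_GET_ADDRESS (dummy%id,  disp(3), ierr)
         disp = disp - base
         call MPI_TYPE_CREATE_STRUCT (nblocks, blocklen, disp, types, &
                                      newtype, ierr)
         call MPI_TYPE_COMMIT (newtype, ierr)
      end subroutine build_particle_type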