From: Aleksandar Donev
Organization: LLNL
To: Van.Snyder@jpl.nasa.gov, sc22wg5
Subject: Re: (j3.2006) (SC22WG5.3961) [Fwd: Fortran concurrency memory model vs. its C++ counterpart]
Date: Fri, 20 Mar 2009 15:32:05 -0700
Message-Id: <200903201532.05764.donev1@llnl.gov>
In-Reply-To: <20090320215810.99DEDC76BB3@www2.open-std.org>
References: <20090320215810.99DEDC76BB3@www2.open-std.org>

Some answers:

On Friday 20 March 2009 14:57, Van Snyder wrote:
> 1) I couldn't find a clear description of the semantics of a data
> race.

We do have one; only the words "definition" and "race" are not used.
Rather, it is a bunch of restrictions.

> If I access a coarray element while someone else is writing it,
> what are the possible outcomes?

It is not allowed to do that. We do not prescribe what happens with
non-conforming programs---the compiler/RTL can do whatever.

>         c) The program behavior becomes undefined.  This is the
> approach taken by Posix and C++0x.  Mostly allows existing
> optimizations.

Yes, this is our choice too, though I do not know what Posix does
exactly.

> 2) Presumably the intent is to prohibit the compiler from introducing
> new "speculative" stores to co-arrays that may add data races?

Correct. Such "optimizations", which are harmless in a serial context,
can cause problems with shared data, and compilers must disable them. We
don't say such things explicitly in Fortran---the compilers need to do
the right thing to produce the correct answer for any conforming
program. If they do the above, they are in error.

> C++ and C are expected to outlaw this, with exceptions
> for sequences of contiguous bit-fields (which I assume don't exist in
> Fortran?)

They don't, yet :-)

> 3) Fortran relies at least superficially on explicit memory fences
> ("sync_memory") to provide memory ordering guarantees.  C++0x and
> Java instead provide a "sequential consistency for data-race-free
> programs" guarantee by default, with some esoteric constructs to
> defeat that for performance.

By contrast, Fortran's default is performance and we have "some esoteric
constructs" to ensure some form of sequential consistency.

> (On X86, with shared memory, I'd expect a factor
> of 10 or so difference between bracketing every atomic access with
> sync_memory vs. the C++ approach.

Yes, but what is the cost of the implementation ensuring
sequential consistency on, say, a cluster with 1000 nodes???

We hope not to see coarray codes with loads of SYNC MEMORY statements.
Use another language for that---coarrays are meant to cover
coarse-grained parallelism for the most part. We don't even provide real
atomic features, just a simple load and store!

>  (For example, you get into subtle issues as to whether data
> dependencies are required to enforce memory ordering.  Programmers
> tend to automatically assume yes.  Implementers tend to automatically
> assume no.)

These are indeed tricky, and we have argued over them at length.
However, again, the hope is that most (Fortran---recall that these tend
to be different from people such as yourselves!) programmers will not
delve into such depths.

> I am actually hoping that hardware implementations gradually favor
> the C++/Java approach even more

Perhaps for shared-memory one-machine type hardware. Certainly it won't
happen for large clusters anytime soon, will it?

> 4) I'm not sure what, if anything, volatile means with respect to
> concurrent access by multiple images.

Neither do we, and I do not care to know :-) I am not trying to be
annoying; it is just a question I have had to discuss on this list
waaaay too many times for no good reason or outcome.

Probably, implementations will follow the C model, whatever that happens
to be. The standard itself will defer this as a processor-dependent
issue... I hope.

Best,
Aleks
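
P.S. For readers who want the contrast in point 3 spelled out, here is a
minimal C++ sketch of the two styles. It is only a loose shared-memory
analogy and not a statement of Fortran or coarray semantics: the first
function uses default (sequentially consistent) atomics, the second
brackets relaxed accesses with explicit fences, roughly in the spirit of
sync_memory. The function names pass_with_seq_cst and pass_with_fences
are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Style A: default (sequentially consistent) atomic flag.
// The ordinary write to `payload` is ordered before the flag store,
// and the reader's flag load is ordered before its read of `payload`.
int pass_with_seq_cst(int value) {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // synchronization flag
    int seen = -1;

    std::thread writer([&] {
        payload = value;
        ready.store(true);           // memory_order_seq_cst by default
    });
    std::thread reader([&] {
        while (!ready.load()) { std::this_thread::yield(); }
        seen = payload;              // no data race: ordered by the flag
    });
    writer.join();
    reader.join();
    assert(ready.load());            // both threads finished, flag is set
    return seen;
}

// Style B: relaxed flag accesses plus explicit fences.
// The release fence before the store and the acquire fence after the
// load together provide the ordering that Style A gets by default.
int pass_with_fences(int value) {
    int payload = 0;
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread writer([&] {
        payload = value;
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread reader([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;              // ordered by the fence pair
    });
    writer.join();
    reader.join();
    assert(ready.load(std::memory_order_relaxed));
    return seen;
}
```

Both functions hand the value from one thread to the other without a
data race. Dropping either fence in the second version would make the
read of payload a race, which is the undefined-behavior case discussed
under point 1.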