From: Van Snyder
Reply-To: Van.Snyder@jpl.nasa.gov
To: sc22wg5
Subject: Re: (j3.2006) (SC22WG5.5561) LCPC conference in Raleigh
Date: Tue, 15 Sep 2015 13:43:23 -0700

On Tue, 2015-09-15 at 13:37 +0000, Bill Long wrote:
> > Many speakers remarked that
> > multigrain parallelism gives greater speed-up.  Some speakers mentioned
> > fork-join constructs.  Others mentioned tasks and threads (I don't know
> > what distinctions they drew between these).  Somebody mentioned futures.

> My experience is biased by SLURM, but I usually assume task -> image, and
> thread -> SMP thread within an image to support local parallelism (OpenMP,
> DO CONCURRENT, Async I/O, …).
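
Here is a minimal sketch of that two-level mapping in standard Fortran:
images as the task level, DO CONCURRENT as the thread level within an
image.  The array and its size are purely illustrative:

    program two_level
      implicit none
      integer :: i, me
      real :: a(1000)[*]           ! one copy per image: the "task" level
      me = this_image()
      do concurrent (i = 1:1000)   ! local parallelism within one image:
        a(i) = sqrt(real(i * me))  ! the "thread" level
      end do
      sync all                     ! coordination happens at the image level
    end program two_level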

Ada tasks don't fit the image model well.  Maybe they're closer to threads.  "Concurrent and Real-Time Programming in Ada" by Burns and Wellings describes yet another construct, which I think arrived in Ada 95, that behaves like a persistent task that can be reactivated where it suspended, instead of being destroyed and recreated.  It is more like a coroutine than a subroutine.

Univac 1100 Exec had (at least) five APIs for parallelism: Fork, Exit, Activate, Deactivate, and Wait.  Fork created what we would today call a thread.  Exit destroyed it.  Deactivate put a thread to sleep without destroying it.  Activate restarted a sleeping thread.  Wait waited for a thread to deactivate or exit.  There was also a Name API that got a thread's name so that another thread could wait for it or activate it.  Of these, Fortran can already emulate Deactivate (DACT) with an event wait and Activate (ACT) with an event post, as sketched below.

To address unstructured problems (graph, mesh, sparse matrix), where the opportunities for parallelism depend more upon the data presented than upon the properties of the algorithm, we need two more parallelism constructs between DO CONCURRENT and coarray images: a fork-join construct, which one can fake with a SELECT CASE inside DO CONCURRENT (also sketched below), and either a "spawn" construct or a task unit a la Ada.
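
A minimal sketch of the Deactivate/Activate emulation using Fortran 2018
events.  The image numbers and the event variable name are illustrative
only:

    program deact_act
      use, intrinsic :: iso_fortran_env, only: event_type
      implicit none
      type(event_type) :: wake[*]   ! one event variable per image

      if (this_image() == 2) then
        event wait (wake)           ! "Deactivate": image 2 sleeps until posted
      else if (this_image() == 1) then
        event post (wake[2])        ! "Activate": image 1 wakes image 2
      end if
    end program deact_act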
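
And a sketch of faking a three-way fork-join with SELECT CASE inside
DO CONCURRENT.  The task procedures are hypothetical placeholders; note
they must be pure to be referenced inside DO CONCURRENT:

    program fake_fork_join
      implicit none
      integer :: i
      do concurrent (i = 1:3)     ! "fork": iterations may run in parallel
        select case (i)
        case (1)
          call task_a()           ! each branch is a different task body
        case (2)
          call task_b()
        case (3)
          call task_c()
        end select
      end do                      ! "join": the construct completes only
                                  ! when every iteration has finished
    contains
      pure subroutine task_a()
      end subroutine
      pure subroutine task_b()
      end subroutine
      pure subroutine task_c()
      end subroutine
    end program fake_fork_join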

One of the presenters (it might have been Hadia Ahmed) described persistent MPI transactions.  Does anybody's coarray implementation use this (or its equivalent in a different transport mechanism)?
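
If that refers to what the MPI standard calls persistent communication
requests, here is a minimal sketch via the mpi_f08 binding: the setup cost
is paid once in MPI_Send_init/MPI_Recv_init, and each MPI_Start restarts
the same transfer.  Buffer size, ranks, tags, and the iteration count are
illustrative only:

    program persistent_demo
      use mpi_f08
      implicit none
      type(MPI_Request) :: req
      real :: buf(1000)
      integer :: rank, step

      call MPI_Init()
      call MPI_Comm_rank(MPI_COMM_WORLD, rank)

      if (rank == 0) then
        buf = 1.0
        call MPI_Send_init(buf, size(buf), MPI_REAL, 1, 0, MPI_COMM_WORLD, req)
      else if (rank == 1) then
        call MPI_Recv_init(buf, size(buf), MPI_REAL, 0, 0, MPI_COMM_WORLD, req)
      end if

      do step = 1, 10             ! the same "transaction", restarted each step
        if (rank <= 1) then
          call MPI_Start(req)
          call MPI_Wait(req, MPI_STATUS_IGNORE)
        end if
      end do

      if (rank <= 1) call MPI_Request_free(req)
      call MPI_Finalize()
    end program persistent_demo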
