On Oct 8, 2004, at 8:50 AM, Vinod Kannan wrote:
> I think I should have made it more clear.
> Case 1: parent spawns (not spawn_multiple's) child1
> child2. How does child1 talk to child2?
The problem we are having here is a difference in terminology. See below.
> Case 2: parent spawns child1 child2 child3 using
> spawn_multiple. I create groups and related
> communicators with the children using comm_world.
> Later child3 dies. parent spawns/spawn_multiple's
> child4 to compensate. How does child4 communicate with
> peer's child1 & child2?
See below.
> Jeff, if I understand you right, "the output of SPAWN
> and SPAWN_MULTIPLE are the same: an intercommunicator
> spanning *all* the children and parents. ".
All children that were spawned via that call, yes. Specifically, SPAWN
and SPAWN_MULTIPLE are collective over the parents and the children
being spawned by that one function invocation. So I wasn't specific; I
assumed you were only talking about one SPAWN/SPAWN_MULTIPLE
invocation.
> That doesn't seem to be the case. Successive spawn()'s
> seem to create different intercommunicators between
> parent and child, i.e. parent has a different
> intercommunicator with each child.
Correct. So if you call MPI_COMM_SPAWN twice, you'll get 2 intercomms
out. The first one will contain the parents and children of the first
SPAWN; the second will contain the parents and children of the second
SPAWN.
As you have deduced, the children between the two spawns are not
connected in an MPI sense -- they cannot directly communicate without
further setup.
The solution to your problem is likely to be in one of the other MPI-2
dynamic scenarios -- you can have MPI processes accept/connect to each
other (analogous to TCP sockets). Hence, your child processes can
connect/accept to each other to establish MPI communication. Since LAM
is single-threaded, you'll need to do an elaborate dance to ensure that
you don't deadlock, but such things are possible. Check the MPI-2
standard in the chapter about dynamic processes -- read up on name
publishing and lookup as well as MPI_COMM_CONNECT and
MPI_COMM_ACCEPT.
> {comm_get_parent(commParent) + comm_size(commParent) }
> OR comm_size(comm_world) both return 1 with rank of 0
> for each and every child, under any case using spawn
> (with spawn_multiple, the same calls return a size
> equal to number of children with appropriate ranks).
>
> In the same context, my understanding is node-failure
> can be detected by handling LAM_SIGSHRINK. How can we
> detect process failure on a live node (say someone
> kills off the process inadvertently). In the process
> failure scenario too Lam does not crash if we have
> used only one mpirun ( all processes except one have
> been spawned by the one started by mpirun). My
> understanding is doing a test on a send/receive
> request can be a deterministic way of process failure
> detection. Is there any robust way provided by the Lam
> system to detect process failure on a live node?
Yes and no. LAM's FT detection is quite rudimentary (real FT is quite
a complex problem, particularly within the constraints of MPI
semantics), and probably won't do what you want. Long term, we want
to enable this kind of behavior (if a process or a node dies, enable
the MPI and run-time environment to continue operating -- just
tolerating the failure), but our path forward with this will likely be
in Open MPI, not LAM/MPI.
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/