LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-07-27 14:43:27


On Jul 24, 2004, at 2:23 PM, Martin Wood wrote:

> I am working on creating a fault tolerance case. I started with the
> lam/examples/fault, although a recent posting has thrown some doubt on
> this.
>  
> I would like to have a MASTER (M) that spawns, say, 3 processes, A,
> B and C. I also want all the processes to run on different nodes, so
> 4 nodes in this case.
>  
> I believe MPI_COMM_SPAWN creates 3 separate intercommunicators for A,
> B and C, such that I will have M->A, M->B and M->C. That should be
> fine assuming SPAWN will allow going to different nodes - will it?

Yes, it will. See the man page for MPI_Comm_spawn(3) -- it should
explain the scheduling details; you can use specific MPI_Info keys to
indicate where you want processes launched, if you want.

Whether they end up in 3 different communicators depends on how you
invoke SPAWN. If you invoke SPAWN 3 times (one each for A, B, C), then
yes, they will be in 3 communicators. But if you invoke SPAWN once,
then you'll get one intercommunicator out, with the parent processes in
one group and the three spawned processes in the other.

> Now I would like to create an intercommunicator for each of C->A
> and C->B.

If you did a single SPAWN, then you have an intercommunicator
containing these three processes, and, in fact, they're all in a common
COMM_WORLD. Do you need an intercommunicator for a specific reason?
If you really need an intercommunicator, and all of A, B, and C are in
a single COMM_WORLD, you could probably use MPI_COMM_JOIN and/or
MPI_COMM_CONNECT/MPI_COMM_ACCEPT to make an intercommunicator.

> I looked at MPI_INTERCOMM_MERGE, but it talks of merging INTRA
> communicators to create an INTER communicator, but A, B and C are all
> INTER-comm?  So I looked at maybe forming some "groups", but it's not
> clear what "group" A, B and C belong to, since all seem to have their
> own MPI_COMM_WORLD?

This depends on how you are invoking SPAWN, per above.

Hope this helps!

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/