
LAM/MPI General User's Mailing List Archives


From: Jim Lasc (jimlasc_at_[hidden])
Date: 2005-08-15 07:59:51


Sorry for my late reply, but I took some days off...

******************************
Keep in mind that when you INTERCOMM_MERGE, you're not adding processes
to an existing communicator -- you're getting an entirely new
communicator. So using this scheme, you'll have to ditch all your
prior requests and re-post them. This will likely be a fairly complex
scheme, because there are race conditions that may occur (depending on
your implementation) with messages that were sent on the old communicator
but not yet received, etc.
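
The "ditch and re-post" point might look something like this (my own
untested sketch, not from Jeff's mail; the spawned command name
"worker" and tag 0 are illustrative assumptions):

```c
/* Untested sketch: after MPI_Intercomm_merge you hold a brand-new
 * intracommunicator, so requests posted on the old communicator
 * must be cancelled and re-posted.  "worker" and tag 0 are
 * illustrative assumptions. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm inter, merged;
    MPI_Request req;
    int buf = 0;

    MPI_Init(&argc, &argv);

    /* A receive posted on the old communicator... */
    MPI_Irecv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &req);

    /* Spawn one new process; the result is an intercommunicator. */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                   MPI_COMM_WORLD, &inter, MPI_ERRCODES_IGNORE);

    /* Merging does NOT grow MPI_COMM_WORLD: it returns an entirely
     * new intracommunicator containing both groups. */
    MPI_Intercomm_merge(inter, 0, &merged);

    /* ...so the old request has to be ditched and re-posted
     * against the merged communicator. */
    MPI_Cancel(&req);
    MPI_Request_free(&req);
    MPI_Irecv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0, merged, &req);

    /* ... normal operation continues on `merged` ... */

    MPI_Cancel(&req);
    MPI_Request_free(&req);
    MPI_Comm_free(&merged);
    MPI_Comm_free(&inter);
    MPI_Finalize();
    return 0;
}
```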

-> And, is there a way to "merge" two communicators, or to add one process
to a communicator? I browsed through the MPI_Comm_* functions, and didn't
find anything usable.

Another option is to use pair-wise communicators, and just use an array
of requests that you WAITANY (or somesuch) on. Then the communicator
that each process is located in doesn't matter too much (except during
startup, addition, deletion, and shutdown) during normal operations of
sending and receiving.
-> Do you mean that, when you have a ring where each node (process) can
speak with its two neighbours, I should make a communicator (with
"members": procnr-1; procnr; procnr+1) for each node?
But, in that case, I still need a communicator which contains all the
processes (to determine the procnr)? Is this correct?
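
For what it's worth, Jeff's pair-wise scheme might be sketched like this
(my own untested sketch; the pairing colours, tag 0, and the assumption
of an even process count are mine, not from the thread):

```c
/* Untested sketch of the pair-wise idea: each process shares a
 * two-member communicator with each ring neighbour, posts one
 * receive per pair, and services whichever completes with
 * MPI_Waitany.  Assumes an even number of processes so the two
 * split passes form clean pairs; tag 0 is arbitrary. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);

    /* MPI_COMM_WORLD still contains every process, so it can keep
     * supplying the global "procnr" (rank), at least at startup. */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Two split passes build the pair communicators:
     * pass 0 pairs (0,1)(2,3)...; pass 1 pairs (1,2)(3,4)...(size-1,0). */
    MPI_Comm pair[2];
    int color0 = rank - (rank % 2);
    int color1 = (rank % 2) ? rank : (rank + size - 1) % size;
    MPI_Comm_split(MPI_COMM_WORLD, color0, rank, &pair[0]);
    MPI_Comm_split(MPI_COMM_WORLD, color1, rank, &pair[1]);

    /* Post one receive per neighbour, then send our world rank to
     * "the other" member (rank 1 - me inside a two-member comm). */
    int buf[2];
    MPI_Request reqs[2];
    for (int i = 0; i < 2; i++) {
        int prank;
        MPI_Comm_rank(pair[i], &prank);
        MPI_Irecv(&buf[i], 1, MPI_INT, 1 - prank, 0, pair[i], &reqs[i]);
        MPI_Send(&rank, 1, MPI_INT, 1 - prank, 0, pair[i]);
    }

    /* Service completions in whatever order they arrive. */
    for (int done = 0; done < 2; done++) {
        int which;
        MPI_Waitany(2, reqs, &which, MPI_STATUS_IGNORE);
        /* buf[which] now holds one neighbour's world rank. */
    }

    MPI_Comm_free(&pair[0]);
    MPI_Comm_free(&pair[1]);
    MPI_Finalize();
    return 0;
}
```

This way only startup, addition, and removal touch the global
communicator; normal sends and receives only ever involve a pair.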

Good. Let us know what you come up with as a final solution; there are
others trying to tackle similar problems.
-> OK. Once I'm done (beginning of September...)

Thanks for all your help.

On 8/12/05, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
> On Aug 12, 2005, at 6:38 AM, Jim Lasc wrote:
>
> > > (6) When a nodenumber changes, and a message is in transit between
> > > sender and receiver, I can consider the message as lost, correct?
> > > When finished, I want it to be totally decentralised, so that the new
> > > node can connect with a node of its choice.
> >
> > To clarify: I don't know what you mean by "nodenumber" -- there is no
> > such thing. Every MPI process has a unique process rank in each
> > communicator that it is in. So if a process is in multiple
> > communicators (and, by definition, every process is in at least
> > MPI_COMM_WORLD and MPI_COMM_SELF), it may have a different rank
> > in each communicator.
> > -> by nodenumber I mean the MPI rank in the communicator which contains
> > ALL the processes/nodes (1 process/node)
>
> Ok.
>
> MPI message delivery is guaranteed (unless the source or destination
> process dies *and* the MPI implementation is capable of handling such
> faults without aborting). Take the following example:
>
> - assume an MPI implementation that can handle process faults
> - a communicator contains 3 processes: A, B, C
> - A sends a message to C
> - C has not received the message yet
> - B dies
> - C can (and should) still eventually receive the message
>
> So no, messages are not lost.
>
> > > (7)This means I should open a port on every node from the
> > > "start-group", correct?
> >
> > I'm not sure what you mean here...?
> > -> the start group is the nodes from MPI_COMM_WORLD (before any nodes
> > are added...)
>
> Good. Let us know what you come up with as a final solution; there are
> others trying to tackle similar problems.
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>