> (6) When a nodenumber changes while a message is in transit between
> sender and receiver, I can consider the message as lost, correct?
> When finished I want it to be totally decentralised, so that the new
> node can connect with a node of its choice.
To clarify: I don't know what you mean by "nodenumber" -- there is no
such thing. Every MPI process has a unique process rank in each
communicator that it is in. So if a process is in multiple
communicators (and, by definition, every process is in at least
MPI_COMM_WORLD and MPI_COMM_SELF), then it may have a different rank
in each communicator.
-> by "nodenumber" I mean the MPI rank in the communicator which contains
ALL the processes/nodes (1 process/node).
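For reference, a minimal (untested) sketch of the distinction: the same
process can have different ranks in different communicators, here using
MPI_Comm_split to create a second communicator:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int world_rank, sub_rank;
        MPI_Comm subcomm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* Split the world into two halves; ranks are renumbered
           from 0 within each new communicator. */
        MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &subcomm);
        MPI_Comm_rank(subcomm, &sub_rank);

        printf("world rank %d has rank %d in its sub-communicator\n",
               world_rank, sub_rank);

        MPI_Comm_free(&subcomm);
        MPI_Finalize();
        return 0;
    }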
> (7) This means I should open a port on every node from the
> "start-group", correct?
I'm not sure what you mean here...?
-> the start group is the nodes from MPI_COMM_WORLD (without any nodes added yet...)
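If that is what is meant, a rough (untested) sketch of "opening a port" on
each start-group process could look like this (how the port names get
distributed to new nodes -- file, name service, ... -- is a separate problem):

    char port_name[MPI_MAX_PORT_NAME];
    MPI_Comm newcomm;

    /* Each process that should accept new nodes opens its own port... */
    MPI_Open_port(MPI_INFO_NULL, port_name);

    /* ...makes port_name known to the outside world somehow, and then
       waits for a new node to connect (this call blocks). */
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &newcomm);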
Thanks for your answers; they made things a lot clearer to me.
Jim
On 8/11/05, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
> On Aug 10, 2005, at 4:03 PM, Jim Lasc wrote:
>
> > Now I want to add nodes.
> > By adding a node I mean the following:
> > connecting a computer which is unknown at the time of startup (one I
> > just bought, for example) to the ring, and allowing it (the new
> > node) to speak with its neighbour nodes.
> >
> > (1) How should I implement that (see below...)?
> > (2) When I use MPI_Comm_spawn, I can't "say" that it has to be spawned
> > on the new node, because MPI decides by itself where to spawn; is this
> > correct?
>
> Each MPI implementation's inner workings of MPI_COMM_SPAWN are likely
> to be different. LAM allows some degree of placement of processes on
> nodes -- see LAM's MPI_Comm_spawn(3) man page.
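> For illustration, with an implementation that honours the reserved "host"
> info key (whether and how it is honoured is implementation-dependent), a
> rough sketch could look like this -- "./worker" and the host name are just
> placeholders:
>
>     MPI_Comm children;
>     MPI_Info info;
>     int errcodes[1];
>
>     MPI_Info_create(&info);
>     /* "host" is a reserved MPI_Comm_spawn info key, but support for it
>        varies between implementations. */
>     MPI_Info_set(info, "host", "newnode.example.com");
>
>     MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1, info, 0,
>                    MPI_COMM_WORLD, &children, errcodes);
>     MPI_Info_free(&info);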
>
> > (3) So I should use MPI_Open_port on a "master-node" and connect the
> > new node with the master-node, correct?
>
> That is one way to do it, yes.
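> The new node's side of that would look roughly like the following
> (untested sketch; the master side mirrors it with MPI_Open_port and
> MPI_Comm_accept, and how the new node learns the port name -- e.g. a
> file or MPI_Publish_name/MPI_Lookup_name -- is up to you):
>
>     char port_name[MPI_MAX_PORT_NAME];
>     MPI_Comm master;
>
>     /* port_name was obtained out of band from the master node. */
>     MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &master);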
>
> > And, MPI_Comm_accept is blocking, so if I want the new node to be
> > able to connect at any moment,
> > (4) I should use a thread solely for the MPI_Comm_accept, is this
> > correct?
>
> That is also a common way to do it. However, be aware that your MPI
> implementation must be thread safe to do this. LAM/MPI is not. We
> demonstrated exactly this, however, last year at SC with Open MPI.
> Open MPI, unfortunately, is not yet available to the public. :-\
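> A rough sketch of that approach (assuming a thread-safe MPI and
> pthreads; error checking omitted):
>
>     #include <pthread.h>
>     #include <mpi.h>
>
>     static char port_name[MPI_MAX_PORT_NAME];
>
>     /* Runs in its own thread so the blocking accept does not stall
>        the main computation; needs MPI_THREAD_MULTIPLE support. */
>     static void *accept_loop(void *arg)
>     {
>         while (1) {
>             MPI_Comm newcomm;
>             MPI_Comm_accept(port_name, MPI_INFO_NULL, 0,
>                             MPI_COMM_SELF, &newcomm);
>             /* ... hand newcomm over to the main thread ... */
>         }
>         return NULL;
>     }
>
>     /* In main(): */
>     int provided;
>     pthread_t tid;
>     MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>     /* check provided == MPI_THREAD_MULTIPLE before starting the thread */
>     MPI_Open_port(MPI_INFO_NULL, port_name);
>     pthread_create(&tid, NULL, accept_loop, NULL);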
>
> > (5) When I use MPI_Intercomm_merge, is there a way to say that I want
> > the nodes 0-n to keep their rank, and that I want the new node to have
> > rank n+1?
> > Because (see above) I posted a lot of Irecvs in the startup phase
> > (and the Irecvs are reposted once they are filled),
> > I prefer only having to change the Irecvs from node 0 and node n
> > instead of all the Irecvs
> > (and this gives fewer problems for messages which are in transit
> > between sender and receiver).
>
> Keep in mind that when you INTERCOMM_MERGE, you're not adding processes
> to an existing communicator -- you're getting an entirely new
> communicator.  So using this scheme, you'll have to ditch all your
> prior requests and re-post them.  This will likely be a fairly complex
> scheme, because race conditions may occur (depending on your
> implementation) with messages that were sent on the old communicator
> but not yet received, etc.
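> For the rank-ordering part of question (5), though: the "high" argument
> of MPI_Intercomm_merge controls which group gets the lower ranks in the
> merged communicator.  Roughly (with "intercomm" being the
> inter-communicator from the accept/connect or spawn step):
>
>     MPI_Comm merged;
>
>     /* Existing group (ranks 0..n): pass high = 0. */
>     MPI_Intercomm_merge(intercomm, 0, &merged);
>
>     /* New node: pass high = 1, so it is ordered after the existing
>        group and ends up with rank n+1 in the merged communicator. */
>     MPI_Intercomm_merge(intercomm, 1, &merged);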
>
> Another option is to use pair-wise communicators, and just use an array
> of requests that you WAITANY (or somesuch) on.  Then, during normal
> send/receive operations, it doesn't matter too much which communicator
> each process is located in (except during startup, addition, deletion,
> and shutdown).
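> A sketch of the request-array idea (assuming one pair-wise communicator
> per neighbour in comm[i], and made-up NUM_NEIGHBOURS / TAG constants):
>
>     MPI_Request reqs[NUM_NEIGHBOURS];
>     int         bufs[NUM_NEIGHBOURS];
>     int         i, idx;
>
>     /* One outstanding receive per neighbour, each on its own
>        pair-wise communicator. */
>     for (i = 0; i < NUM_NEIGHBOURS; i++)
>         MPI_Irecv(&bufs[i], 1, MPI_INT, MPI_ANY_SOURCE, TAG,
>                   comm[i], &reqs[i]);
>
>     while (1) {
>         MPI_Waitany(NUM_NEIGHBOURS, reqs, &idx, MPI_STATUS_IGNORE);
>         /* handle bufs[idx], then re-post on the same communicator */
>         MPI_Irecv(&bufs[idx], 1, MPI_INT, MPI_ANY_SOURCE, TAG,
>                   comm[idx], &reqs[idx]);
>     }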
>
> > (6) When a nodenumber changes while a message is in transit between
> > sender and receiver, I can consider the message as lost, correct?
> > When finished I want it to be totally decentralised, so that the new
> > node can connect with a node of its choice.
>
> To clarify: I don't know what you mean by "nodenumber" -- there is no
> such thing. Every MPI process has a unique process rank in each
> communicator that it is in. So if a process is in multiple
> communicators (and, by definition, every process is in at least
> MPI_COMM_WORLD and MPI_COMM_SELF), then it may have a different rank
> in each communicator.
>
> > (7) This means I should open a port on every node from the
> > "start-group", correct?
>
> I'm not sure what you mean here...?
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>