On Feb 13, 2008, at 4:05 PM, fahad saeed wrote:
> Ok. Let me put it this way. How can I use message passing to execute a
> binary on different nodes?
Message passing is not the same thing as launching an executable on
different nodes. Message passing is sending messages from one process
to another. The fact that mpirun *also* starts processes on multiple
nodes is a side-effect -- we have to start the processes before we can
exchange messages between them.
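To make the distinction concrete, here is a minimal sketch of what actual message passing looks like in C, using only standard MPI calls. It assumes the program is started with at least two processes (for example with something like "mpirun -np 2 ./a.out"); the payload value 42 and the tag 0 are arbitrary choices for illustration.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        /* Nothing to exchange with only one process. */
        if (rank == 0) {
            fprintf(stderr, "run this with at least 2 processes\n");
        }
    } else if (rank == 0) {
        /* Rank 0 sends one integer to rank 1 (tag 0, arbitrary). */
        int payload = 42;
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0 sent %d\n", payload);
    } else if (rank == 1) {
        /* Rank 1 receives that integer from rank 0. */
        int payload = 0;
        MPI_Status status;
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d\n", payload);
    }

    /* Every process reaches MPI_Finalize before it exits. */
    MPI_Finalize();
    return 0;
}
```

The launcher only starts the processes; the MPI_Send / MPI_Recv pair is the message passing itself.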
What it sounds like you want to do is use MPI as a bootstrap to launch
several executables on multiple nodes. While MPI can do that, it's
not really what it's designed for. And the fact that you call
MPI_INIT and then don't call MPI_FINALIZE will always cause errors
with LAM/MPI.
LAM has a launcher explicitly for non-MPI applications; you probably
want to use that instead. See the man page for lamexec(1).
>
> Fahad
>
> > From: jsquyres_at_[hidden]
> > To: lam_at_[hidden]
> > Date: Wed, 13 Feb 2008 16:02:16 -0500
> > Subject: Re: LAM: caused collective abort of all ranks
> >
> > On Feb 13, 2008, at 2:51 PM, fahad saeed wrote:
> >
> > > What would I have to do if I have to use MPI? I mean, isn't it true
> > > that MPI is for message passing, and the thing that I am trying to
> > > do is a kind of message passing?
> >
> > I don't see you doing any message passing in your app -- you call
> > MPI_INIT and then execl() (thereby replacing the MPI process with
> > bash). There's no sending of messages anywhere.
> >
> > > Could you please elaborate on the resource manager? What kind of
> > > resource manager?
> >
> > A resource manager to allocate the nodes in your cluster, such as
> > SLURM or Torque, etc.
> >
> > --
> > Jeff Squyres
> > Cisco Systems
> >
> > _______________________________________________
> > This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
Jeff Squyres
Cisco Systems