LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2003-05-31 09:01:22


On Mon, 26 May 2003, Ndong-Nna Guitry-Evrard wrote:

> And we want to know, if we can use some tools and libraries of
> MPICH-1.2.5 with LAM-6.5.7.

Yes, but you will need to recompile them for LAM.

> It will very helpfull, if you can tell us how launch the lam daemons on
> a linux cluster. We have followed all the procedures of the
> installation's guide. Lam is installed in all the nodes. But when we
> tried to execute an example of mpi program, the task failed.
> This is the constant message, we receive after failure :
> ----------------------------------------------------------------------------
> It seems that there is no lamd running on this host, which indicates
> [snipped]
> ----------------------------------------------------------------------------
>
> This is the invokation,we made before :
>
> rsh $lastNode lamboot -s $lamconfile
> rsh $lastNode cd ~/test1/io
> /usr/local/lam-mpi/bin/mpirun -np $nnodes async -fname razoir
> rsh $lastNode lamhalt
> rm -f $lamconfile

There are a few problems with the command sequence you listed:

- each rsh starts a separate remote shell that shares no state with
  the one before it
- for example, when you "rsh ... cd ...", the results of the cd are
  lost as soon as rsh completes
- the mpirun command has no rsh in front of it, so it is not executed
  on the node that you ran lamboot on (and evidently the local node is
  not listed in $lamconfile); this is specifically why you are getting
  the error -- LAM was not started on the node that you are running
  this script on
- it would be better to write all of these commands into a single
  script, and then run *that* script on $lastNode. For example, have
  a script named "mpi_stuff":

  lamboot -s $lamconfile
  cd ~/test1/io
  /usr/local/lam-mpi/bin/mpirun -np $nnodes async -fname razoir
  lamhalt
  rm -f $lamconfile

  and then run:

  rsh $lastNode mpi_stuff

That should work properly for you.
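
One detail to watch: shell variables such as $lamconfile and $nnodes
that are set in your calling script are not visible inside a script
that runs on the remote node, so they have to be passed along somehow.
Below is a rough sketch of what "mpi_stuff" could look like if it
takes them as arguments. The argument order, the run() helper, and the
DRY_RUN switch are illustrative assumptions on my part, not anything
that LAM provides; the paths are the ones from your original commands.

```shell
#!/bin/sh
# mpi_stuff (hypothetical sketch): boot LAM, run the job, shut LAM down.
# Usage: mpi_stuff <boot-schema-file> <number-of-processes>

# Helper (an assumption for illustration): with DRY_RUN=1 the command is
# only printed, which makes it easy to check the sequence without LAM.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

mpi_stuff() {
    lamconfile=$1
    nnodes=$2

    # Start the LAM daemons on the nodes listed in the boot schema.
    run lamboot -s "$lamconfile" || return 1

    # Change directory in the same shell that will run mpirun.
    run cd "$HOME/test1/io" || { run lamhalt; return 1; }

    run /usr/local/lam-mpi/bin/mpirun -np "$nnodes" async -fname razoir
    status=$?

    # Always shut LAM down and clean up, even if mpirun failed.
    run lamhalt
    run rm -f "$lamconfile"
    return "$status"
}

# Only run when both arguments are supplied.
if [ "$#" -eq 2 ]; then
    mpi_stuff "$1" "$2"
fi
```

The calling script would then run something like
"rsh $lastNode /path/to/mpi_stuff $lamconfile $nnodes", where
/path/to/mpi_stuff is a placeholder for wherever you put the script.
This assumes that both the script and the file named by $lamconfile
are readable on $lastNode (for example, on a shared home directory).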

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/