
LAM/MPI General User's Mailing List Archives


From: Warner Yuen (wyuen_at_[hidden])
Date: 2004-10-17 14:18:26


Hello LAM folks:

I'm trying to get LAM-MPI to work with Myrinet. I read on the mailing
list that the only hope is to try one of the versions on the SVN server.
So I tried it, but I can't seem to get lamboot to work. When I go back
and use LAM-7.0.6 it works fine. Any ideas on what's up?

For my configuration, I used:

./configure --with-rsh=/usr/bin/ssh --prefix=/hpc/tools/lam-gcc-7.2b/ \
    --with-gm=/opt/gm --with-ib=/usr/mellanox

--------------------------- My lamboot error ---------------------------

node25:~/hpltest warner$ lamboot -v lammachines

LAM 7.2b1svn10172004/MPI 2 C++/ROMIO - Indiana University

n-1<19290> ssi:boot:base:linear: booting n0 (node25.cluster.private)
n-1<19290> ssi:boot:base:linear: booting n1 (node26.cluster.private)
-----------------------------------------------------------------------------
The lamboot agent failed to read a message over a socket from the
newly-booted process. This should not happen (especially since TCP is
a guaranteed protocol).

Please check your network connectivity and ensure that messages can be
passed reliably over TCP. Additionally, ensure that the host where
the newly-booted process was launched is healthy and still available
on the network.
-----------------------------------------------------------------------------
n-1<19290> ssi:boot:base:linear: aborted!
n-1<19296> ssi:boot:base:linear: booting n0 (node25.cluster.private)
n-1<19296> ssi:boot:base:linear: booting n1 (node26.cluster.private)
n-1<19296> ssi:boot:base:linear: booting n2 (node27.cluster.private)
n-1<19296> ssi:boot:base:linear: booting n3 (node28.cluster.private)
n-1<19296> ssi:boot:base:linear: booting n4 (node29.cluster.private)
n-1<19296> ssi:boot:base:linear: booting n5 (node30.cluster.private)
n-1<19296> ssi:boot:base:linear: booting n6 (node31.cluster.private)
n-1<19296> ssi:boot:base:linear: booting n7 (node32.cluster.private)
n-1<19296> ssi:boot:base:linear: finished
lamboot did NOT complete successfully
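
The error text above asks for a check that each host is reachable. A minimal sketch of such a check follows. It assumes the boot schema (lammachines here) lists one hostname per line, optionally followed by extra fields or "#" comments, and that ssh logins to the nodes need no password. The helper names schema_hosts and check_hosts are made up for illustration and are not part of LAM.

```shell
#!/bin/sh
# Print the hostname (first field) of every non-blank, non-comment line
# in a boot schema file. Assumes "#" starts a comment.
schema_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Attempt a non-interactive ssh login to each host and report the result.
# -n keeps ssh from consuming the host list on stdin;
# BatchMode=yes makes ssh fail rather than prompt for a password.
check_hosts() {
    schema_hosts "$1" | while read -r host; do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "ok   $host"
        else
            echo "FAIL $host"
        fi
    done
}

# Usage (not run automatically):
#   check_hosts lammachines
```

LAM's own recon command, run as "recon -v lammachines", performs a similar pre-boot check against the same schema and may be the simpler first step.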

Warner Yuen
Research Computing Consultant
Apple Computer
email: wyuen_at_[hidden]
Tel: 408.718.2859
Fax: 408.718.0133