[wxm@pc2 lam]$ lamboot -v lamhosts

LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University

n-1<4600> ssi:boot:base:linear: booting n0 (192.168.33.202)
n-1<4600> ssi:boot:base:linear: booting n1 (192.168.33.203)
wxm@192.168.33.203's password:
wxm@192.168.33.203's password:
-----------------------------------------------------------------------------
The lamboot agent failed to read a message over a socket from the
newly-booted process. This should not happen (especially since TCP is
a guaranteed protocol).

*** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
*** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
*** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
*** MAILING LIST.

You should probably check the following:

    - Network connectivity: Ensure that messages can be passed
      reliably over TCP using random ports.
    - Environment / PATH settings: Ensure that you are running the
      same version of LAM/MPI on all nodes. Sometimes premature
      disconnects (and therefore this error message) may be caused
      if mismatched versions of LAM are used on different nodes.
    - Node health: Ensure that the host where the newly-booted
      process was launched is healthy and still available on the
      network.
-----------------------------------------------------------------------------
n-1<4600> ssi:boot:base:linear: aborted!
n-1<4606> ssi:boot:base:linear: booting n0 (192.168.33.202)
n-1<4606> ssi:boot:base:linear: booting n1 (192.168.33.203)
wxm@192.168.33.203's password:
wxm@192.168.33.203's password:
n-1<4606> ssi:boot:base:linear: finished
lamboot did NOT complete successfully