[nilesh@UG04 ~]$ lamboot -v n1.txt

LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University

n-1<3877> ssi:boot:base:linear: booting n0 (192.168.41.104)
n-1<3877> ssi:boot:base:linear: booting n1 (192.168.41.103)
nilesh@192.168.41.103's password:
nilesh@192.168.41.103's password:
-----------------------------------------------------------------------------
The lamboot agent timed out while waiting for the newly-booted process
to call back and indicated that it had successfully booted. As far as
LAM could tell, the remote process started properly, but then never
called back. Possible reasons that this may happen:

    - There are network filters between the lamboot agent host and
      the remote host such that communication on random TCP ports
      is blocked
    - Network routing from the remote host to the local host isn't
      properly configured (this is uncommon)

You can check these things by watching the output from "lamboot -d".

1. On the command line for hboot, there are two important parameters:
   one is the IP address of where the lamboot agent was invoked, the
   other is the port number that the lamboot agent is expecting the
   newly-booted process to call back on (this will be a random
   integer).

2. Manually login to the remote machine and try to telnet to the port
   indicated on the hboot command line. For example,

       telnet

   If all goes well, you should get a "Connection refused" error. If
   you get any other kind of error, it could indicate either of the
   two conditions above. Consult with your system/network
   administrator.
-----------------------------------------------------------------------------
n-1<3877> ssi:boot:base:linear: aborted!
-----------------------------------------------------------------------------
lamboot encountered some error (see above) during the boot process,
and will now attempt to kill all nodes that it was previously able to
boot (if any).

Please wait for LAM to finish; if you interrupt this process, you may
have LAM daemons still running on remote nodes.
-----------------------------------------------------------------------------
n-1<3883> ssi:boot:base:linear: booting n0 (192.168.41.104)
n-1<3883> ssi:boot:base:linear: booting n1 (192.168.41.103)
nilesh@192.168.41.103's password:
nilesh@192.168.41.103's password:
n-1<3883> ssi:boot:base:linear: finished
lamboot did NOT complete successfully
[nilesh@UG04 ~]$
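The telnet check that the error message suggests in step 2 can also be done from a short script, which is handy if telnet is not installed on the remote node. The following is only a rough sketch: `probe_tcp` is a hypothetical helper name, not part of LAM, and the way the results are interpreted is an assumption based on the wording of the message above.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "No route to host" or "Network is unreachable".
        return f"error: {exc}"
```

Run on the remote node (here n1), it would be called with the IP address and port number taken from the hboot command line shown by "lamboot -d". Going by the message above, "refused" is the expected result once lamboot has given up listening. A "timeout" would be consistent with a filter dropping packets, and an "error: ..." result mentioning routing would point at the second condition, though neither outcome is conclusive on its own.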