[clususer@vlsiserver ~]$ lamboot -v n3.txt

LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University

n-1<2417> ssi:boot:base:linear: booting n0 (192.168.41.55)
n-1<2417> ssi:boot:base:linear: booting n1 (192.168.41.103)
-----------------------------------------------------------------------------
The lamboot agent failed to open a client socket to the newly-booted
process at IP address 192.168.41.103, port 32817. Although the
newly-booted process has already communicated successfully with the
lamboot agent over other TCP sockets, this is the first time that the
lamboot agent tried to initiate a connection to the newly-booted
process.

As such, this may indicate:

1. 192.168.41.103 is not the correct IP address for the machine where
   the newly-booted machine was launched
2. There are network filters between the lamboot agent host and the
   remote host such that communication on random TCP ports is blocked
3. Network routing from the the local host to the remote isn't
   properly configured (this is unlikely)

For number 1, check to ensure that 192.168.41.103 is the correct IP
address for that machine. If it is not, check the host mapping on
that machine (e.g., /etc/hosts) to ensure that 192.168.41.103 is both
reachable and is the by the host where the lamboot agent is running,
and is the correct host.

For numbers 2 and 3, try to telnet to 192.168.41.103, port 32817. You
should get a "connection refused" error, which will indicate that you
successfully connected to some machine at that IP address, and no
process was listening on that port. If you get any other kind of
error, check with your system/network administrator -- it may
indicate network / routing issues between the two hosts.
-----------------------------------------------------------------------------
n-1<2417> ssi:boot:base:linear: aborted!
-----------------------------------------------------------------------------
lamboot encountered some error (see above) during the boot process,
and will now attempt to kill all nodes that it was previously able to
boot (if any).

Please wait for LAM to finish; if you interrupt this process, you may
have LAM daemons still running on remote nodes.
-----------------------------------------------------------------------------
n-1<2423> ssi:boot:base:linear: booting n0 (192.168.41.55)
n-1<2423> ssi:boot:base:linear: booting n1 (192.168.41.103)
n-1<2423> ssi:boot:base:linear: finished
lamboot did NOT complete successfully
[clususer@vlsiserver ~]$
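
The telnet check that the error message suggests can also be scripted. Below is a minimal sketch in Python; it is not part of LAM/MPI, and the function name `probe_tcp_port` and the 3-second timeout are assumptions chosen for illustration. It tries a plain TCP connection and reports whether the attempt was accepted, refused, or timed out. Going by the message above, "refused" would be the expected result once lamboot has aborted, while a timeout or another error might point at a packet filter or a wrong address, though that reading is only a rough guide.

```python
import socket


def probe_tcp_port(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <reason>".
    Hypothetical helper for illustration; not a LAM/MPI tool.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"        # something is listening on that port
    except ConnectionRefusedError:
        return "refused"     # host answered, nothing is listening
    except socket.timeout:
        return "timeout"     # no answer at all within the timeout
    except OSError as exc:
        return "error: " + str(exc)  # e.g. no route to host
    finally:
        sock.close()


if __name__ == "__main__":
    # Address and port taken from the lamboot output above.
    print(probe_tcp_port("192.168.41.103", 32817))
```

Run it on the machine where lamboot was started (vlsiserver here), since the failed connection was attempted from the lamboot agent toward the remote node.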