On Jul 6, 2005, at 11:07 AM, Sumeet Kapur wrote:
> Some Success ! code works on single processor..but lamboot failure on
> remote machine and a couple of Questions:
> Yes Tim, the source code names were .C, I changed it to .c
>
> 2. Another problem is running the lamboot on multiple machines.
> I have put the error output of lamboot -d lamhosts in the
> attached file
> error-lamboot
>
Just out of curiosity, did you read the error message from lamboot?
It looks like the problem is that the second machine (skinner2) was
unable to open a TCP connection back to the first host (krusty2).
This is usually because there is some type of firewall software
running on the hosts that is preventing TCP connections. LAM uses
random TCP and UDP ports for communication, so you have to allow TCP
and UDP connections from the full range of ports.
We see this a lot with Linux these days, as the stock installers for
most distributions turn on a fairly hefty firewall by default. You
should check with your sysadmins and get them to modify the firewall
to not be so protective between the machines you are trying to use
with LAM.
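If you want to confirm the firewall theory before talking to your
sysadmins, a quick check outside of LAM can help. The following is only
a rough sketch in Python, not anything LAM itself provides: the helper
names listen_once and can_connect are made up for illustration. The
idea is to open a listening TCP socket on a random port on the first
host and try to connect to it from the second.

```python
import socket


def listen_once(bind_addr="0.0.0.0", port=0):
    """Open a listening TCP socket and return (socket, port).

    port=0 asks the OS for a random free port, which roughly mimics
    the random ports mentioned above. Hypothetical helper.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_addr, port))
    srv.listen(1)
    return srv, srv.getsockname()[1]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds.

    A firewall that drops or rejects the connection shows up as a
    timeout or a refused connection, both of which are OSError.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On krusty2 you would call listen_once() and note the port it returns,
then on skinner2 call can_connect("krusty2", port) with that port. If
that returns False while ordinary services such as ssh still work, a
firewall blocking the high ports is a likely culprit. This only
exercises TCP, and only one port, so treat it as a hint rather than
proof.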
Hope this helps,
Brian
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/