We could ssh OK, but we did find problems in pfilter.conf. I made some
changes and restarted pfilter. Everything is OK now.
Thanks VERY much!!
Dennis
Dennis Gurgul
Massachusetts General Hospital
Research Management
617.724.3169
dgurgul_at_[hidden]
-----Original Message-----
From: Jeff Squyres [mailto:jsquyres_at_[hidden]]
Sent: Tuesday, July 08, 2003 3:45 PM
To: General LAM/MPI mailing list
Subject: Re: LAM: lamboot fails on some nodes
On Tue, 8 Jul 2003, Gurgul, Dennis J. wrote:
> I have a 5 node cluster with OSCAR 2.2 and LAM 6.5.9. All 4 internal
> nodes are identical. But, while 2 of them will work, the other two will
> [snipped]
> The last line in the output before the error message (lamboot
> encountered some error.....) is:
>
> topology n3...
>
> The error message says "(see above)", however, there is nothing to
> indicate what went wrong.
That's fairly odd. :-)
It seems like this might be a networking problem -- the "topology" phase
is where LAM is sending around the connection information.
- Did the firewall software somehow get enabled on any of the nodes?
In OSCAR, pfilter *should* be configured to allow connections from any
port to any port within the cluster.
- Can you ssh between all the nodes properly (without a password)?
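If it helps, the ssh check can be scripted. This is only a rough sketch in Python, not anything that ships with LAM or OSCAR: the function name check_passwordless_ssh is made up for illustration. It relies on the OpenSSH client options BatchMode (fail instead of prompting for a password) and ConnectTimeout, so a node that asks for a password or never answers shows up as a failure instead of hanging.

```python
import subprocess


def check_passwordless_ssh(hosts, run=subprocess.run, timeout=5):
    """Return {host: bool}; True means a non-interactive ssh login succeeded.

    Hypothetical helper for illustration only. `run` is injectable so the
    logic can be exercised without a real cluster.
    """
    results = {}
    for host in hosts:
        cmd = [
            "ssh",
            "-o", "BatchMode=yes",               # never prompt for a password
            "-o", f"ConnectTimeout={timeout}",   # give up on unreachable nodes
            host,
            "true",                              # trivial remote command
        ]
        try:
            proc = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            results[host] = proc.returncode == 0
        except OSError:
            # The ssh client could not be started at all on this machine.
            results[host] = False
    return results
```

Calling check_passwordless_ssh with the hostnames from your boot schema returns a dictionary of host to True/False. It only tests connections from the node it runs on, so to cover "between all the nodes" it would have to be run on each node in turn. A pass here says nothing about the firewall question, since pfilter could allow ssh and still block the other ports LAM uses.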
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/