On Tue, 3 Sep 2002, Bill Bruno wrote:
> I just installed lam-mpi from the latest source release.
> I did recon and it works. But lamboot fails:
<SNIP>
> hboot: fork /home/david/nds/Billb/lam/bin/lamd
> [1] 25701 lamd -H 10.1.2.201 -P 33170 -n 0 -o 0 -d
> hboot: attempting to execute
> -----------------------------------------------------------------------------
> lamboot encountered some error (see above) during the boot process,
> and will now attempt to kill all nodes that it was previously able to
> boot (if any).
>
> Please wait for LAM to finish; if you interrupt this process, you may
> have LAM daemons still running on remote nodes.
> -----------------------------------------------------------------------------
Usually when this happens on the local host, it is a problem with the
setup of the machine. The lamd attempts to open a TCP socket to the
lamboot process. When it fails to do this, lamboot times out and gives
the mostly useless error message above (it can't know what happened, so
all it can really say is "something happened").
The usual suspects in this situation are firewalls and packet filtering.
If you are running a recent Red Hat release and turned the security all the
way up on the packet filtering config menu during install, you will probably
run into this kind of problem.
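
One rough way to check whether local packet filtering is getting in the way
is to mimic the connection pattern described above: listen on a TCP port on
the host's own address, then try to connect to it. The following is only a
sketch of that idea, not what lamd or lamboot actually run; the function name
can_connect_to_self and the 5 second timeout are made up for illustration.

```python
import socket


def can_connect_to_self(host="127.0.0.1", timeout=5.0):
    """Return True if a TCP connection from this host to `host` succeeds.

    Hypothetical helper: it listens on an ephemeral port on `host` and then
    connects to that port, roughly the pattern of one local process
    connecting back to another.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(timeout)
    try:
        # Port 0 asks the kernel to pick any free port.
        listener.bind((host, 0))
        listener.listen(1)
        port = listener.getsockname()[1]

        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.settimeout(timeout)
        try:
            # A filtered connection typically shows up here as a timeout
            # or a refused connection, both of which raise OSError.
            client.connect((host, port))
            conn, _addr = listener.accept()
            conn.close()
            return True
        except OSError:
            return False
        finally:
            client.close()
    except OSError:
        # Could not bind or listen on this address at all.
        return False
    finally:
        listener.close()


if __name__ == "__main__":
    import sys

    address = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    ok = can_connect_to_self(address)
    print("%s: %s" % (address, "connection ok" if ok else "connection FAILED"))
```

Run it with the address that appears after -H in the hboot output
(10.1.2.201 in the log above). If loopback works but that address fails,
a packet filter on the machine is a likely suspect, although a passing test
does not prove that LAM's own ports are open.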
Without knowing more about your hardware and operating environment, that
is all the advice I can offer.
Hope this helps,
Brian
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/