Hello -
I apologize for the delay in replying. We are all grad students and are
getting a bit behind at the end of the semester :).
So, the error message you were receiving gives you the short answer -
hboot was unable to find a program it was looking for in your path. The
common solution for this problem is to run:
rsh <hostname> 'echo $path'
and make sure that it matches what you expect. Not all "dot files" are
executed when logging in non-interactively, so the path that rsh sees can
differ from the one you get in a normal login shell.
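If you have several nodes to check, a small loop can save some typing. This
is only a sketch: the check_paths name, the RSH override variable, and the
host names are all made up for illustration. It uses $PATH rather than $path
on the assumption that the upper-case environment variable is expanded by
both csh-style and Bourne-style remote shells.

```shell
#!/bin/sh
# Sketch: print the PATH that a non-interactive remote shell sees on each host.
# RSH is a hypothetical override so that ssh (or a test stub) can stand in
# for rsh; it is not something LAM itself reads from this script.
RSH=${RSH:-rsh}

check_paths() {
    for host in "$@"; do
        printf '%s: ' "$host"
        # Single quotes keep $PATH from being expanded on the local machine.
        # stdin is redirected so the remote command cannot swallow input.
        $RSH "$host" 'echo $PATH' </dev/null
    done
}

# Example (host names are placeholders):
#   check_paths node1 node2
```

Each output line is the host name followed by the path seen on that host, so
any machine whose path differs from the others should stand out.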
There is one very odd thing - hboot is looking for "kernel" instead of
"lamd". This is not the default behavior of any recent version of LAM,
and it does not look like what should happen based on the parameters to
hboot. What version of LAM are you using? Are you sure it is properly
installed and the first version of LAM in your path on all machines?
My guess is that there is a borked install of LAM on the remote machine
that is earlier in the path than the version you expect to be running.
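One way to look for a shadowing install is to list every copy of a program
along a given path, in search order. The following is a rough sketch, not
anything shipped with LAM: find_copies is a made-up helper, and it has to be
run on the remote machine itself, passing in the path that the rsh command
above reported (which may differ from your interactive path).

```shell
#!/bin/sh
# Sketch: print every executable named $1 found in the colon-separated
# directory list $2, in search order. The first line printed is the copy
# that would actually be run.
find_copies() {
    prog=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        # Skip empty path entries rather than guessing what they mean.
        [ -n "$dir" ] || continue
        if [ -x "$dir/$prog" ] && [ ! -d "$dir/$prog" ]; then
            echo "$dir/$prog"
        fi
    done
    IFS=$old_ifs
}

# Example (the path shown is a placeholder, not a recommendation):
#   find_copies hboot "/usr/local/bin:/usr/bin:/bin"
```

If this prints more than one line for hboot or lamd, the first one listed is
the install that is actually being used on that machine.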
Brian
On Thu, 1 May 2003, Rohan Inamdar wrote:
> Hello,
>
> I am not able to boot lam on multiple hosts.
>
> It's really getting bad now. I read through most of the messages and checked
> things like -
>
> 1. Hosts do not resolve to 127.0.0.1
> 2. Write permission to /tmp folder
> 3. /etc/hosts file
> 4. .rhosts file on each node.
>
> Still I am not able to solve my problem mentioned in
> http://www.lam-mpi.org/MailArchives/lam/msg05789.php
>
> I still get the exact same error.
>
> Can someone help me?
>
> Rohan
>
> www.iit.edu/~inamroh
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/