Hi all,
I have been trying for several days to use my own dynamic LAM installation.
This is the command I use in my PBS script:
lamboot -v -ssi boot rsh $PBS_NODEFILE
I also created a .rhosts file like the docs suggested.
Our admin told me that there is no need to run the node file through
sort | uniq. They also told me that the ssh protocol isn't supported in rsh
mode, yet ssh seems to be what lamboot is trying to use.
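
To check which remote agent a node actually accepts, independent of LAM, the
same "echo $SHELL" probe that lamboot performs can be run by hand. The
following is only a rough sketch: probe_nodes is a hypothetical helper name,
and it assumes the agent (rsh or ssh) is on the PATH inside the job.

```shell
# probe_nodes AGENT NODEFILE
# Runs 'echo $SHELL' through AGENT on every unique host listed in NODEFILE
# and reports whether the remote command succeeded.
probe_nodes() {
    agent=$1
    nodefile=$2
    # sort -u only avoids probing the same host twice
    sort -u "$nodefile" | while read -r host; do
        # stdin is redirected so the agent cannot swallow the host list
        if $agent "$host" 'echo $SHELL' </dev/null >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}
```

Inside the PBS script this would be called as probe_nodes rsh "$PBS_NODEFILE",
and again with ssh as the first argument for comparison.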
One more thing: the docs seem to ask you to use the "tm" boot module, but my
admin specifically asked me not to use it because it causes some problems in
PBS.
Can you please look at the errors generated and tell me what I should do to
avoid them:
n0<8632> ssi:boot:base:linear: booting n0 (Empire-05-15)
n0<8632> ssi:boot:base:linear: booting n1 (Empire-05-14)
ERROR: LAM/MPI unexpectedly received the following on stderr:
ssh: connect to host Empire-05-14 port 22: Connection refused
-----------------------------------------------------------------------------
LAM failed to execute a process on the remote node "Empire-05-14".
LAM was not trying to invoke any LAM-specific commands yet -- we were
simply trying to determine what shell was being used on the remote
host.
LAM tried to use the remote agent command "ssh"
to invoke "echo $SHELL" on the remote node.
This usually indicates an authentication problem with the remote
agent, or some other configuration type of error in your .cshrc or
.profile file. The following is a list of items that you may wish to
check on the remote node:
- You have an account and can login to the remote machine
- Incorrect permissions on your home directory (should
probably be 0755)
- Incorrect permissions on your $HOME/.rhosts file (if you are
using rsh -- they should probably be 0644)
- You have an entry in the remote $HOME/.rhosts file (if you
are using rsh) for the machine and username that you are
running from
- Your .cshrc/.profile must not print anything out to the
standard error
- Your .cshrc/.profile should set a correct TERM type
- Your .cshrc/.profile should set the SHELL environment
variable to your default shell
....<snipped>
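
The permission items in the list above can be checked with a small script. This
is a minimal sketch, assuming GNU stat (as found on most Linux clusters);
check_mode is a hypothetical helper name, and 755 and 644 are the values the
LAM message itself suggests.

```shell
# check_mode PATH EXPECTED
# Compares the octal permission bits of PATH with EXPECTED (e.g. 755).
check_mode() {
    path=$1
    expected=$2
    if [ ! -e "$path" ]; then
        echo "$path missing"
        return 1
    fi
    # GNU stat syntax; BSD stat would need: stat -f '%Lp'
    actual=$(stat -c '%a' "$path")
    if [ "$actual" = "$expected" ]; then
        echo "$path ok ($actual)"
    else
        echo "$path is $actual, expected $expected"
        return 1
    fi
}
```

On each node this could be run as check_mode "$HOME" 755 followed by
check_mode "$HOME/.rhosts" 644.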
Thanks,
Pushkar