LAM/MPI General User's Mailing List Archives

From: Vishal Sahay (vsahay_at_[hidden])
Date: 2004-05-28 22:46:15


It looks like LAM was built with the --enable-shared option, and your
LD_LIBRARY_PATH is not set properly on all the nodes. Specifically, it
should include prefix/lib (where prefix is the directory where LAM is
installed) on every node. The best way to do that is to set it in your
dot files.
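
As a rough sketch of the dot-file change for a Bourne-style shell: the
prefix /opt/lam-7.0.5 and the helper name add_lib_dir are placeholders
made up for illustration, not anything LAM provides, so substitute the
--prefix that was actually used at configure time. For csh or tcsh the
equivalent would be a setenv line in .cshrc.

```shell
# Hypothetical install prefix; replace with the --prefix LAM was configured with.
LAM_PREFIX="${LAM_PREFIX:-/opt/lam-7.0.5}"

# Prepend a directory to LD_LIBRARY_PATH unless it is already listed,
# so sourcing the dot file repeatedly does not grow the variable.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;    # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# liblam.so.0 is expected under prefix/lib.
add_lib_dir "$LAM_PREFIX/lib"
```

The lines need to go in a file that is read by non-interactive remote
shells, since hboot is started through rsh rather than a login session.
Which file that is depends on the shell. Running something like
rsh Empire-09-02 'echo $LD_LIBRARY_PATH' from the head node should then
show the new entry.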

Hope this helps...

-Vishal

On Thu, 27 May 2004, Pushkar Pradhan wrote:

# I'm unable to boot my nodes with lam 7.0.5 (recently installed). This is my
# boot command in the PBS script:
# lamboot -v -ssi boot rsh -ssi rsh_agent "rsh" $PBS_NODEFILE
#
# And below are the errors in the error file, the output file doesn't contain
# the message "topology done" which I guess is printed if it's successful.
#
# n-1<1718> ssi:boot:base:linear: booting n0 (Empire-09-14)
# n-1<1718> ssi:boot:base:linear: booting n1 (Empire-09-02)
# ERROR: LAM/MPI unexpectedly received the following on stderr:
# hboot: error while loading shared libraries: liblam.so.0: cannot open shared
# object file: No such file or directory
# -----------------------------------------------------------------------------
# LAM attempted to execute a process on the remote node "Empire-09-02",
# but received some output on the standard error.
#
# LAM tried to use the remote agent command "rsh"
# to invoke "hboot" on the remote node.
#
# This can indicate an authentication error with the remote agent, or
# can indicate an error in your $HOME/.cshrc, $HOME/.login, or
# $HOME/.profile files. The following is a list of items that you may
# wish to check on the remote node:
# .......
# .......
#
# I tried pasting the rsh command and this is the result:
# Redstone[1153] pushkar$ rsh Empire-09-02 -n hboot -t -c
# lam-conf.lamd -v -sessionsuffix pbs-59687.Empire -s -I "-H 172.16.9.14 -P
# 32837 -n 1 -o 0"
# poll: protocol failure in circuit setup
#
# I made sure all the libs and binaries are in my path.
# Can anyone tell what's wrong? Thanks,
# Pushkar
#
# _______________________________________________
# This list is archived at http://www.lam-mpi.org/MailArchives/lam/
#