LAM/MPI General User's Mailing List Archives

From: RENE ALEXANDER BARRERA (renea_barrera_at_[hidden])
Date: 2006-01-20 10:29:07


Hello all,
I am new to LAM and cluster computing. I have connected two machines.
The operating system is Red Hat 9.
The LAM version is 7.1.1.
Each machine has a user lam with a shared LAMHOME.
First I created user accounts with the same name on both machines. The
username is lam.
LAMHOME is shared through Samba with authentication. I added the following
line to /etc/fstab to mount LAMHOME:
//192.168.45.1/lam /home/lam smbfs username=lam,passwd=lam,user,exec,fmask=644,dmask=755 0 0
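
The fmask and dmask options set the permission bits that files and directories on an smbfs mount appear to have locally. As a minimal sketch (the helper name mode_has_exec is hypothetical, not part of LAM or Samba), this checks whether an octal mode includes any execute bit:

```shell
# Hypothetical helper: report whether an octal permission mode
# includes any execute bit (user, group, or other).
mode_has_exec() {
    # A leading 0 makes the shell arithmetic read the value as octal.
    if [ $(( 0$1 & 0111 )) -ne 0 ]; then
        echo yes
    else
        echo no
    fi
}

mode_has_exec 644   # fmask value from the fstab entry
mode_has_exec 755   # dmask value from the fstab entry
```

A mode that reports "no" here means files under the mount would not be marked executable, which matters if binaries are stored on the share.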

The Samba server is the host 192.168.45.1.
I added the user lam to the list of Samba users, with the password lam.
I edited /etc/samba/smb.conf to add the share:

[global]
workgroup = WORKGROUP
hosts allow = 192.168.45. 127.
encrypt passwords = yes
smb passwd file = /etc/samba/smbpasswd

[lam]
comment = lam
path = /home/lam
valid users = lam
public = no
writable = yes
printable = no

Then I installed LAM-7.1.1 on the server machine (192.168.45.1):

cd $HOME
gunzip -c lam-7.0.6.tar.gz | tar xf -
rm lam-7.0.6.tar.gz
cd lam-7.0.6
./configure --prefix=/home/lam --without-fc
make
make install
make lamexamples

Then I modified /home/lam/.bashrc to add $HOME/bin to the PATH:

PATH=$HOME/bin:$PATH
MANPATH=$MANPATH:$HOME/man
LAMHOME=/home/lam
LAMRSH="ssh -x -i /etc/lam.key"
export PATH MANPATH LAMHOME LAMRSH
export LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=$HOME/lib:$LIBRARY_PATH
export C_INCLUDE_PATH=$HOME/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=$HOME/include:$CPLUS_INCLUDE_PATH
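
One way to confirm that the PATH edit took effect is to test whether a directory is an entry in the colon-separated list. This is only a sketch; the helper name path_contains is hypothetical:

```shell
# Hypothetical helper: print "yes" if directory $1 is an entry in the
# colon-separated list $2 (for example "$PATH"), otherwise "no".
path_contains() {
    case ":$2:" in
        *":$1:"*) echo yes ;;
        *)        echo no ;;
    esac
}

path_contains "$HOME/bin" "$PATH"
```

Wrapping both the list and the pattern in colons avoids a false match on a longer directory name such as $HOME/bin2.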
Then I generated the SSH key:

ssh-keygen -t rsa -f lam.key
cat lam.key.pub >> $HOME/.ssh/authorized_keys2
chmod 600 $HOME/.ssh/authorized_keys2
rm lam.key.pub
mv lam.key /etc/lam.key

ssh -i /etc/lam.key lam@localhost.localdomain

This command executed successfully.

Then I edited the file /home/lam/etc/lam-bhost.def:

nodo1.cluster.edu cpu=2 user=lam
nodo2.cluster.edu cpu=1 user=lam
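
As a quick sanity check of a boot schema like this one, the entries can be listed with awk. This is a rough sketch, not LAM's own parser: the helper name list_hosts is hypothetical, and it assumes "#" starts a comment line and that cpu counts as 1 when omitted.

```shell
# Hypothetical helper: read a boot schema on stdin and print
# "host cpus" for each entry; cpu is assumed to be 1 when not given.
list_hosts() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }     # skip comments and blank lines
        {
            cpus = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^cpu=/) cpus = substr($i, 5)
            print $1, cpus
        }
    '
}

list_hosts <<'EOF'
nodo1.cluster.edu cpu=2 user=lam
nodo2.cluster.edu cpu=1 user=lam
EOF
```

With a real file, the same helper could be fed as list_hosts < /home/lam/etc/lam-bhost.def.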

Is this the correct configuration for LAM?
I think it is, but when I run the recon command, LAM reports the following
error:

[lam@nodo1 lam]$ recon
ERROR: LAM/MPI unexpectedly received the following on stderr:
bash: line 1: /home/lam/bin/tkill: Permission denied
------------------------------------------------------------------------
LAM failed to execute a LAM binary on the remote node
"lam_at_[hidden]".
Since LAM was already able to determine your remote shell as "tkill",
it is probable that this is not an authentication problem.

*** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
*** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
*** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
*** MAILING LIST.

LAM tried to use the remote agent command "ssh"
to invoke the following command:

        ssh -x -i /etc/lam.key nodo2.cluster.edu -n -l lam tkill -N

This can indicate several things. You should check the following:

  - The LAM binaries are in your $PATH
  - You can run the LAM binaries
  - The $PATH variable is set properly before your .cshrc/.profile exits

Try to invoke the command listed above manually at a Unix prompt.

You will need to configure your local setup such that you will *not*
be prompted for a password to invoke this command on the remote node.
No output should be printed from the remote node before the output of
the command is displayed.

When you can get this command to execute successfully by hand, LAM
will probably be able to function properly.
------------------------------------------------------------------------
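
The message suggests checking that the LAM binaries can be run. A minimal sketch of that check for one file follows; the helper name check_binary is hypothetical, and it only tests what the current user on the current machine is allowed to do:

```shell
# Hypothetical helper: print "ok" if $1 exists and the current user may
# execute it, otherwise print a short reason.
check_binary() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -x "$1" ]; then
        echo "not executable"
    else
        echo "ok"
    fi
}

check_binary /home/lam/bin/tkill
```

Since the failure was reported for the remote node, the check is more telling when run there, for example by logging in with ssh -i /etc/lam.key lam@nodo2.cluster.edu and running ls -l /home/lam/bin/tkill (assuming the key and hostname shown earlier).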

Please help me!
Thank you very much!
