Dear Brian & Jeff,
Thank you very much for your support.
I had the same LAM source code on both machines, i.e. 7.1.4.
What I did was uninstall ssh, openssh and LAM from my host, then reinstall
them using the same paths as on the other machine, especially when
configuring LAM. Now I can lamboot successfully from both machines.
./configure --with-rsh="/usr/bin/ssh -x" was the key point, I guess!
Thanks,
Yogesh
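
For anyone hitting the same mismatch: the two checks discussed in this
thread (which remote agent lamboot actually tries, and whether LAMRSH is
overriding the compiled default) can be sketched in shell as below. This is
only an illustrative sketch; the helper names remote_agent_of and
report_lamrsh are hypothetical and not part of LAM/MPI, and the parsing
assumes the "attempting to execute:" debug line format quoted further down.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not LAM/MPI commands.

# Print the remote agent binary named in a "lamboot -d" debug line.
# Assumes the line contains "attempting to execute: <agent> ...", as in
# the output quoted in this thread. Prints nothing for other lines.
remote_agent_of() {
    printf '%s\n' "$1" | sed -n 's/.*attempting to execute: \([^ ]*\).*/\1/p'
}

# Report whether LAMRSH is set in the current environment. A run-time
# LAMRSH value overrides the agent compiled in with --with-rsh.
report_lamrsh() {
    if [ -n "${LAMRSH+set}" ]; then
        echo "LAMRSH is set to: $LAMRSH"
    else
        echo "LAMRSH is not set"
    fi
}
```

One possible use is to run report_lamrsh on each node, then pass the
"attempting to execute" line from "lamboot -d" on each node to
remote_agent_of and compare the results. If one node prints rsh and the
other /usr/bin/ssh, the two LAM installations were likely configured
differently.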
On Thu, Jun 4, 2009 at 3:53 PM, Brian W. Barrett <brbarret_at_[hidden]> wrote:
> There are a couple of things to check:
>
> 1) Make sure you don't have the environment variable LAMRSH set, as it
> will override the compiled default.
>
> 2) Make sure you have the same installation of LAM on both nodes.
>
> I'm guessing it'll turn out to be one of those two things.
>
> Brian
>
>
> On Thu, 4 Jun 2009, Yogesh Aher wrote:
>
>> Thanks again!
>> I checked and found the difference.
>> From the host, one of the output lines is:
>>
>> n-1<28382> ssi:boot:rsh: attempting to execute: rsh 100.120.10.41 -n 'echo $SHELL'
>>
>> Whereas from the client, when I run lamboot, the same line is:
>>
>> n-1<6900> ssi:boot:rsh: attempting to execute: /usr/bin/ssh -x 120.100.10.04 -n 'echo $SHELL'
>>
>> This happens even though I installed lam-7.1.4 specifying the path to
>> ssh (option --with-rsh="/usr/bin/ssh -x").
>>
>> How can I make the host use /usr/bin/ssh and not rsh?
>>
>>
>> On Thu, Jun 4, 2009 at 1:36 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>> You should be able to see the details of exactly what lamboot
>> is doing if you use the "-d" option on the command line.
>>
>>
>>
>> On Jun 4, 2009, at 6:55 AM, Yogesh Aher wrote:
>>
>> Dear Jeff,
>>
>> Thank you very much for drawing my attention to this
>> point. I installed LAM again with the option pointing
>> to the path of ssh (/usr/bin/ssh), but I got the same
>> error again. I thought that since the paths now match
>> on both machines it should work, but it doesn't! :(
>> On both machines, the path for mpich/mpicc is now
>> /usr/local/bin and the path for ssh is /usr/bin.
>>
>> I look forward to suggestions for any other checks I
>> need to do.
>>
>> Thanking you,
>>
>> Sincerely,
>> Y.
>>
>> On Wed, Jun 3, 2009 at 6:06 PM, Jeff Squyres
>> <jsquyres_at_[hidden]> wrote:
>> I'm a little confused -- you mention that ssh is in
>> /usr/bin/ssh, but you configured LAM with
>> --with-rsh=/bin/ssh, not --with-rsh=/usr/bin/ssh.
>>
>> Is there a reason for the difference?
>>
>> Note that LAM may not be well setup to handle ssh being
>> installed in multiple different locations across
>> different nodes; I honestly don't remember. :-(
>>
>> IIRC, you can also set the env variable LAMRSH at
>> run-time to change the location of your "rsh" binary
>> (e.g., /usr/bin/ssh vs. /bin/ssh).
>>
>>
>>
>> On Jun 3, 2009, at 11:31 AM, Yogesh Aher wrote:
>>
>> I installed both openssh (openssh-5.2p1) and ssh
>> (ssh-2.4.0), as a user as well as root, also with
>> "prefix=/bin". But they get installed in /usr/bin.
>>
>>
>> On Wed, Jun 3, 2009 at 5:22 PM, Jeff Squyres
>> <jsquyres_at_[hidden]> wrote:
>> It sounds like ssh is not installed on your other node.
>>
>>
>> On Jun 3, 2009, at 11:07 AM, Yogesh Aher wrote:
>>
>> Dear Brian,
>>
>> Thanks for your prompt reply.
>>
>> I ran this command on both machines (host and client),
>> but both give the same message:
>>
>> -bash: /bin/ssh: No such file or directory
>>
>> I installed LAM with the option ./configure
>> --with-rsh="/bin/ssh -x"
>>
>> Also, as I'm planning to use passwordless ssh, I
>> couldn't find the .rhosts and .cshrc/.profile files.
>>
>> Any suggestions about it?
>>
>> Cheers,
>> Y.
>>
>>
>> On Wed, Jun 3, 2009 at 4:59 PM, Brian W. Barrett
>> <brbarret_at_[hidden]> wrote:
>> Did you try to follow any of the suggestions in the
>> error message you cut-n-paste into your e-mail to the
>> list? In particular, does the command:
>>
>>
>>   /bin/ssh -x 100.120.10.41 -n 'echo $SHELL'
>>
>>
>> work properly?
>>
>>
>> Brian
>>
>>
>>
>> On Wed, 3 Jun 2009, Yogesh Aher wrote:
>>
>> Dear LAM-users,
>>
>> I am stuck again getting LAM to work for charm++. I
>> installed ssh, openssh, libaio and the other necessary
>> libraries (as suggested in earlier archives) again, but
>> I still get the following error. If anybody has come
>> across such an error, will you please let me know how
>> to resolve it? Also, please let me know if any
>> permission changes need to be made.
>>
>> [sam_at_xyz Linux-i686-MPI]$ lamboot -v /home/sam/.nodelist
>>
>> LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
>>
>> n-1<962> ssi:boot:base:linear: booting n0 (100.120.10.04)
>> n-1<962> ssi:boot:base:linear: booting n1 (100.120.10.41)
>>
>> -----------------------------------------------------------------------------
>> LAM failed to execute a process on the remote node "100.120.10.41".
>> LAM was not trying to invoke any LAM-specific commands yet -- we were
>> simply trying to determine what shell was being used on the remote
>> host.
>>
>> LAM tried to use the remote agent command "/bin/ssh"
>> to invoke "echo $SHELL" on the remote node.
>>
>> *** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
>> *** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
>> *** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
>> *** MAILING LIST.
>>
>> This usually indicates an authentication problem with the remote
>> agent, some other configuration type of error in your .cshrc or
>> .profile file, or you were unable to executable a command on the
>> remote node for some other reason. The following is a list of items
>> that you should check on the remote node:
>>
>>       - You have an account and can login to the remote machine
>>       - Incorrect permissions on your home directory (should
>>         probably be 0755)
>>       - Incorrect permissions on your $HOME/.rhosts file (if you are
>>         using rsh -- they should probably be 0644)
>>       - You have an entry in the remote $HOME/.rhosts file (if you
>>         are using rsh) for the machine and username that you are
>>         running from
>>       - Your .cshrc/.profile must not print anything out to the
>>         standard error
>>       - Your .cshrc/.profile should set a correct TERM type
>>       - Your .cshrc/.profile should set the SHELL environment
>>         variable to your default shell
>>
>> Try invoking the following command at the unix command line:
>>
>>       /bin/ssh -x 100.120.10.41 -n 'echo $SHELL'
>>
>> You will need to configure your local setup such that you will *not*
>> be prompted for a password to invoke this command on the remote node.
>> No output should be printed from the remote node before the output of
>> the command is displayed.
>>
>> When you can get this command to execute successfully by hand, LAM
>> will probably be able to function properly.
>>
>> -----------------------------------------------------------------------------
>> n-1<962> ssi:boot:base:linear: Failed to boot n1 (100.120.10.41)
>> n-1<962> ssi:boot:base:linear: aborted!
>> n-1<967> ssi:boot:base:linear: booting n0 (100.120.10.04)
>> n-1<967> ssi:boot:base:linear: booting n1 (100.120.10.41)
>>
>> -----------------------------------------------------------------------------
>> .
>> .
>> .
>>
>> When you can get this command to execute successfully by hand, LAM
>> will probably be able to function properly.
>>
>> -----------------------------------------------------------------------------
>> n-1<967> ssi:boot:base:linear: Failed to boot n1 (100.120.10.41)
>> n-1<967> ssi:boot:base:linear: aborted!
>> lamboot did NOT complete successfully
>>
>>
>> Thanking you in advance!
>>
>> Sincerely,
>> Yogesh
>>
>>
>> --
>>   Brian Barrett
>>   LAM/MPI Developer
>>   Make today a LAM/MPI day!
>> _______________________________________________
>> This list is archived at
>> http://www.lam-mpi.org/MailArchives/lam/
>>
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
> --
> Brian Barrett
> LAM/MPI Developer
> Make today a LAM/MPI day!