You should be able to see the details of exactly what lamboot is doing
if you use the "-d" option on the command line.
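As a rough sketch of that check (not a definitive recipe): the script below looks for ssh in the two locations mentioned in this thread, points LAMRSH at whichever one exists, and then runs lamboot with -d. The helper name first_executable, the log file name lamboot-debug.log, and the default host file path are assumptions made for illustration; LAMRSH is the run-time override recalled further down in the quoted text.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that is an executable file.
first_executable() {
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate ssh locations taken from this thread.
if agent=$(first_executable /usr/bin/ssh /bin/ssh); then
    echo "remote agent found at: $agent"
    # Assumption: LAMRSH overrides the agent chosen at configure time.
    LAMRSH="$agent -x"
    export LAMRSH
else
    echo "no ssh binary found in /usr/bin or /bin" >&2
fi

# Host file path is an assumption; override it with HOSTFILE=/path/to/file.
HOSTFILE="${HOSTFILE:-$HOME/.nodelist}"

# Run lamboot in debug mode only if it is installed and the host file exists.
if command -v lamboot >/dev/null 2>&1 && [ -f "$HOSTFILE" ]; then
    lamboot -d "$HOSTFILE" 2>&1 | tee lamboot-debug.log
fi
```

Run the same script on each node; if the reported agent path differs between nodes, that mismatch is the first thing to fix.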
On Jun 4, 2009, at 6:55 AM, Yogesh Aher wrote:
> Dear Jeff,
>
> Thank you very much for drawing my attention to this point. When I
> reinstalled LAM with the option pointing to the path of ssh
> (/usr/bin/ssh), I got the same error again. I thought that since the
> paths now match on both machines it should work, but it doesn't! :(
> For both machines now, the path for mpich/mpicc = /usr/local/bin AND
> path for ssh = /usr/bin
>
> I look forward to your suggestions for any other checks I need to
> do.
>
> Thanking you,
>
> Sincerely,
> Y.
>
> On Wed, Jun 3, 2009 at 6:06 PM, Jeff Squyres <jsquyres_at_[hidden]>
> wrote:
> I'm a little confused -- you mention that ssh is in /usr/bin/ssh,
> but you configured LAM with --with-rsh=/bin/ssh, not
> --with-rsh=/usr/bin/ssh.
>
> Is there a reason for the difference?
>
> Note that LAM may not be well setup to handle ssh being installed in
> multiple different locations across different nodes; I honestly
> don't remember. :-(
>
> IIRC, you can also set the env variable LAMRSH at run-time to change
> the location of your "rsh" binary (e.g., /usr/bin/ssh vs. /bin/ssh).
>
>
>
> On Jun 3, 2009, at 11:31 AM, Yogesh Aher wrote:
>
> I installed both openssh (openssh-5.2p1) and ssh (ssh-2.4.0), as a
> user as well as root, with "prefix=/bin". But it still installs in
> /usr/bin.
>
>
> On Wed, Jun 3, 2009 at 5:22 PM, Jeff Squyres <jsquyres_at_[hidden]>
> wrote:
> It sounds like ssh is not installed on your other node.
>
>
> On Jun 3, 2009, at 11:07 AM, Yogesh Aher wrote:
>
> Dear Brian,
>
> Thanks for your prompt reply.
>
> I gave this command from both (host and client) machines, but both
> give the same message:
>
> -bash: /bin/ssh: No such file or directory
>
> I installed LAM with the option ./configure --with-rsh="/bin/ssh -x"
>
> Also, I'm planning to use passwordless ssh, and I couldn't find
> these .rhosts and .cshrc/.profile files.
>
> Any suggestions about it?
>
> Cheers,
> Y.
>
>
> On Wed, Jun 3, 2009 at 4:59 PM, Brian W. Barrett
> <brbarret_at_lam-mpi.org> wrote:
> Did you try to follow any of the suggestions in the error message
> you cut-n-paste into your e-mail to the list? In particular, does
> the command:
>
>
> /bin/ssh -x 100.120.10.41 -n 'echo $SHELL'
>
> work properly?
>
>
> Brian
>
>
>
> On Wed, 3 Jun 2009, Yogesh Aher wrote:
>
> Dear LAM-users,
>
> I am stuck again getting LAM to work for charm++. I installed ssh,
> openssh, libaio and other necessary libraries (as suggested in earlier
> archives) again, but still get the following error. If anybody has
> come across this error, would you please let me know how to resolve
> it? Also, please let me know if any permission changes need to be
> made.
>
> [sam_at_xyz Linux-i686-MPI]$ lamboot -v /home/sam/.nodelist
>
> LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
>
> n-1<962> ssi:boot:base:linear: booting n0 (100.120.10.04)
> n-1<962> ssi:boot:base:linear: booting n1 (100.120.10.41)
> -----------------------------------------------------------------------------
> LAM failed to execute a process on the remote node "100.120.10.41".
> LAM was not trying to invoke any LAM-specific commands yet -- we were
> simply trying to determine what shell was being used on the remote
> host.
>
> LAM tried to use the remote agent command "/bin/ssh"
> to invoke "echo $SHELL" on the remote node.
>
> *** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
> *** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
> *** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
> *** MAILING LIST.
>
> This usually indicates an authentication problem with the remote
> agent, some other configuration type of error in your .cshrc or
> .profile file, or you were unable to executable a command on the
> remote node for some other reason. The following is a list of items
> that you should check on the remote node:
>
> - You have an account and can login to the remote machine
> - Incorrect permissions on your home directory (should
> probably be 0755)
> - Incorrect permissions on your $HOME/.rhosts file (if you are
> using rsh -- they should probably be 0644)
> - You have an entry in the remote $HOME/.rhosts file (if you
> are using rsh) for the machine and username that you are
> running from
> - Your .cshrc/.profile must not print anything out to the
> standard error
> - Your .cshrc/.profile should set a correct TERM type
> - Your .cshrc/.profile should set the SHELL environment
> variable to your default shell
>
> Try invoking the following command at the unix command line:
>
> /bin/ssh -x 100.120.10.41 -n 'echo $SHELL'
>
> You will need to configure your local setup such that you will *not*
> be prompted for a password to invoke this command on the remote node.
> No output should be printed from the remote node before the output of
> the command is displayed.
>
> When you can get this command to execute successfully by hand, LAM
> will probably be able to function properly.
> -----------------------------------------------------------------------------
> n-1<962> ssi:boot:base:linear: Failed to boot n1 (100.120.10.41)
> n-1<962> ssi:boot:base:linear: aborted!
> n-1<967> ssi:boot:base:linear: booting n0 (100.120.10.04)
> n-1<967> ssi:boot:base:linear: booting n1 (100.120.10.41)
> -----------------------------------------------------------------------------
> .
> .
> .
>
> When you can get this command to execute successfully by hand, LAM
> will probably be able to function properly.
> -----------------------------------------------------------------------------
> n-1<967> ssi:boot:base:linear: Failed to boot n1 (100.120.10.41)
> n-1<967> ssi:boot:base:linear: aborted!
> lamboot did NOT complete successfully
>
>
> Thanking you in advance!
>
> Sincerely,
> Yogesh
>
>
> --
> Brian Barrett
> LAM/MPI Developer
> Make today a LAM/MPI day!
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
--
Jeff Squyres
Cisco Systems