Hello,
I checked lamboot on the .204 node, and it runs there without any errors. The file is also present on that machine, but the problem persists. Does anything need to be configured in the user's .bashrc on the .204 node? If so, could you suggest how I should do it? Please help me out.
Thanking you, jix
Jeff Squyres <jsquyres_at_[hidden]> wrote:
(cross posting to the OSCAR users list since a second copy was sent there)
On Tue, 2 Dec 2003, jix kicks wrote:
> I am currently working on MPI programs. I am using oscar-2.3.1 on my
> cluster. I am unable to run an MPI program on my cluster through LAM; I am
> getting errors during lamboot. recon succeeds. This is the error I
> am getting while running LAM on my machine.
>
> LAM 6.5.8/MPI 2 C++/ROMIO - Indiana University
>
> lamboot: boot schema file: myhosts
> lamboot: opening hostfile myhosts
> lamboot: found the following hosts:
> lamboot: n0 192.168.10.203
> lamboot: n1 192.168.10.204
> lamboot: resolved hosts:
> lamboot: n0 192.168.10.203 --> 192.168.10.203
> lamboot: n1 192.168.10.204 --> 192.168.10.204
> lamboot: found 2 host node(s)
> lamboot: origin node is 0 (192.168.10.203)
> Executing hboot on n0 (192.168.10.203 - 1 CPU)...
> lamboot: attempting to execute "hboot -t -c lam-conf.lam -d -v -I " -H 192.168.10.203 -P 53453 -n 0 -o 0 ""
> hboot: process schema = "/etc/lam/lam-conf.lam"
> hboot: found /usr/bin/lamd
> hboot: performing tkill
> hboot: tkill
> hboot: booting...
> hboot: fork /usr/bin/lamd
> hboot: attempting to execute
> [1] 15626 lamd -H 192.168.10.203 -P 53453 -n 0 -o 0 -d
> Executing hboot on n1 (192.168.10.204 - 1 CPU)...
> lamboot: attempting to execute "/usr/bin/ssh -x -a 192.168.10.204 -n echo $SHELL"
> lamboot: got remote shell /bin/bash
> lamboot: attempting to execute "/usr/bin/ssh -x -a 192.168.10.204 -n hboot -t -c lam-conf.lam -d -v -s -I "-H 192.168.10.203 -P 53453 -n 1 -o 0 ""
> base: cannot find process schema (null):
This is the error -- it does not appear to be an ssh error.
I would double check that LAM is installed properly on the .204 node -- it
seems to be complaining that it can't find the process schema file, which
should probably be /etc/lam/lam-conf.lam (it was found properly on the
.203 node).
1. Can you see if this file exists on the .204 node?
2. Can you lamboot locally on the .204 node? (i.e., run "lamboot" with no
arguments, thereby booting on the .204 node only)
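
The two checks above could be run roughly as follows. This is only a minimal sketch: the path /etc/lam/lam-conf.lam is taken from the hboot output on the .203 node and is assumed to be the same on .204, and check_schema is a hypothetical helper written for illustration, not part of LAM.

```shell
#!/bin/sh
# Process schema path reported by hboot on the .203 node.
# Assumption: the .204 node is expected to use the same location.
SCHEMA=/etc/lam/lam-conf.lam

# Hypothetical helper (not a LAM command): report whether a file is readable.
check_schema() {
    if [ -r "$1" ]; then
        echo "found: $1"
    else
        echo "missing or unreadable: $1"
        return 1
    fi
}

# Check 1, run on the .204 node itself:
#   check_schema "$SCHEMA"
#
# Check 1, run from the .203 node over ssh. The ssh options are the ones
# lamboot itself used in the output quoted above:
#   /usr/bin/ssh -x -a 192.168.10.204 -n ls -l /etc/lam/lam-conf.lam
#
# Check 2, run on the .204 node: boot LAM on the local node only.
#   lamboot
```

Running the check over ssh as well as locally may be worthwhile, since lamboot reaches the .204 node through a non-interactive ssh session rather than a login shell.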
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/