It's probably a little of both (ssh and LAM issues).
You should probably check out the LAM FAQ in the section "Running LAM/MPI
applications"; in particular, the following questions:
- What directory does my LAM/MPI program run in on the remote nodes?
- How does LAM find binaries that are invoked from mpirun?
- Why doesn't "mpirun -np 4 test" work?
Also, see the mpirun(1) man page, which addresses these kinds of issues in
detail (locating files, process environment, current working directory,
etc.).
You may also wish to run the following to ensure that your $PATH is really
what you think it is:
ssh s21 'echo $PATH'
Note the single quotes; they're critical here, because they keep your
local shell from expanding $PATH before ssh ever runs. This will show you
the $PATH that the remote LAM daemon (and therefore all programs that it
starts) is getting.
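
To see why the quoting matters without involving a second machine, here is
a small sketch in plain sh. It is only a stand-in for the ssh case: the
variable DEMO_PATH and the function run_remote are made-up names, and a
child shell with a different environment plays the part of the remote node.
csh/tcsh treat single and double quotes the same way for this purpose, but
the script itself assumes a POSIX sh.

```shell
#!/bin/sh
# DEMO_PATH and run_remote are hypothetical names used only for this sketch.
# The "local" value, as seen by the shell you type commands into:
DEMO_PATH="/local/bin"
export DEMO_PATH

run_remote() {
    # Stand-in for "ssh s21 <command>": run the command string in a child
    # shell whose DEMO_PATH differs from the caller's.
    DEMO_PATH="/remote/bin" sh -c "$1"
}

# Double quotes: the local shell expands $DEMO_PATH first, so the child
# shell only ever sees "echo /local/bin".
double=$(run_remote "echo $DEMO_PATH")

# Single quotes: the literal text 'echo $DEMO_PATH' is passed along and
# expanded by the child shell, which reports its own value.
single=$(run_remote 'echo $DEMO_PATH')

echo "double quotes: $double"
echo "single quotes: $single"
```

With double quotes you would be looking at your local value and could wrongly
conclude that the remote side has the same one.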
Finally, remember that your .tcshrc is only executed when you lamboot.
So if you change the PATH in your .tcshrc after you lamboot, you'll need
to lamboot again to see that change.
On Mon, 4 Aug 2003, Karl Hahn wrote:
> is this a LAM problem or a ssh problem? Can anybody give a
> hint?
>
> I am running LAM with 2 nodes (at the moment, s22 and s21).
> From my .tcshrc:
> setenv LAMRSH "ssh -x"
>
> s22:~/r5>lamnodes
> n0 s22:1:origin,this_node
> n1 s21.xxx.de:1:
>
> When I use 'ssh s21 -x <command>' it seems that .tcshrc
> is executed on the other node before invoking the command
> while .login is not. The program is in my $PATH (in .tcshrc)
> so I don't have to be in the directory to start it.
>
> s22:~/r5>ssh s21 -x prog
> {echo from .tcshrc}
> {Program starting}
>
> When I try to start my program using mpirun the program
> cannot be found on node n1 (if I am not in the directory
> where the program is located). The path to my program is
> included in $PATH which should be the same on node n1
> (man ssh tells me something different, but in my test
> above .tcshrc is executed).
>
> s22:~/r5>mpirun -c 2 prog
> mpirun: cannot start prog on n1: No such file or directory
>
> Why doesn't LAM's mpirun use my $PATH? Do I have to use
> mpirun -c 2 /complete/path/to/my/program/which/can/be/very/long?
> Is there a possibility to avoid that?
>
> Bye,
> Charlie
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/