I do believe that we goofed in lamboot with some shells in 7.1.1. Can
you try the latest 7.1.2 beta (b8)?
I have no OS X package for the beta, but if you really need one, I could
make it.
On Nov 2, 2004, at 3:59 PM, Tony Arcieri wrote:
> I'm trying to run LAM MPI on an Xserve cluster running MacOS 10.3.5 with
> LAM MPI having been installed from a package. I did this successfully on
> a cluster a few months ago, but now that we have our actual cluster I'm
> running into problems with lamboot.
>
> I have a lamhosts file containing the IPs of two systems (there are many
> more, but I'm just trying to get it going on two nodes for now). When I
> execute lamboot -v lamhosts, I run into the following:
>
> node1:~ ccastro$ lamboot -v lamhosts
>
> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
>
> n-1<1359> ssi:boot:base:linear: booting n0 (10.0.0.1)
> n-1<1359> ssi:boot:base:linear: booting n1 (10.0.0.2)
> ERROR: LAM/MPI unexpectedly received the following on stderr:
> sh: line 1: [: missing `]'
> sh: line 1: hboot: command not found
>
> [...]
>
> LAM tried to use the remote agent command "ssh"
> to invoke the following command:
>
> ssh 10.0.0.2 -n '( ! [ -e ./.profile] || . ./.profile;' hboot -t -c lam-conf.lamd -v -s -I '"-H 10.0.0.1 -P 50298 -n 1 -o 0"' )
>
> Correct me if I'm wrong, but the single quote ordering appears to be off,
> and clearly test does not like the bracket being placed right next to the
> filename.
>
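
The bracket complaint can be reproduced without LAM or ssh. What follows is only a minimal sketch: it assumes a POSIX sh with mktemp available, and it assumes the intended test was `[ -e ./.profile ]` with a space before the closing bracket. It does not reflect how lamboot builds the command internally.

```shell
# Work in a scratch directory so no real .profile is touched.
cd "$(mktemp -d)"
touch .profile

# As in the quoted command: "]" is glued to the file name, so "[" never
# sees its closing argument. The shell complains about a missing "]"
# and the test exits with a non-zero status.
sh -c '[ -e ./.profile]'
broken_status=$?

# With a space before "]" the test is well formed and succeeds,
# because ./.profile exists.
sh -c '[ -e ./.profile ]'
fixed_status=$?

echo "broken=$broken_status fixed=$fixed_status"
```

Because the malformed test fails before .profile is ever sourced, the PATH set there never takes effect, which is consistent with the "hboot: command not found" line in the error output.
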
> Regardless, .profile as well as /etc/profile are configured so the LAM
> utilities are in the path (although neither seems to be processed by ssh
> for non-login shells) and whatever glue LAM is using to attempt to
> process them is evidently failing. Example:
>
> node1:~ ccastro$ ssh node2 source .profile;hboot
> -----------------------------------------------------------------------------
> The booted program is missing at least one of the -H, -P, or -n
> command line arguments. These arguments are required to tell the
> booted program how to contact the booting agent.
>
> Cannot continue. Sorry.
> -----------------------------------------------------------------------------
>
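
One caveat about the example above: the `;` in `ssh node2 source .profile;hboot` is not quoted, so the local shell splits the line there. Only `source .profile` is sent to node2, and hboot then runs on node1, which would explain why it was found at all. The sketch below illustrates the splitting without a cluster. The `remote` function and the `WHERE` variable are hypothetical stand-ins: `remote` plays the role of `ssh node2` by joining its arguments and handing them to a second shell.

```shell
# Hypothetical stand-in for "ssh node2": joins its arguments with spaces
# and passes the result to another shell, which sees WHERE=remote.
WHERE=local
remote() { WHERE=remote sh -c "$*"; }

# Unquoted ";": the local shell ends the "remote" command at the semicolon,
# so the echo runs locally (compare: ssh node2 source .profile;hboot).
unquoted=$(remote true; echo "$WHERE")

# Quoted: the whole string, semicolon included, reaches the second shell
# (compare: ssh node2 'source .profile; hboot').
quoted=$(remote 'true; echo "$WHERE"')

echo "unquoted form ran echo in the $unquoted shell"
echo "quoted form ran echo in the $quoted shell"
```

To check whether sourcing .profile on the remote side really puts hboot in the path, the quoted form `ssh node2 'source .profile; hboot'` is the one to try.
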
> Is there any way to alter the method by which lamboot is invoking hboot on
> the other hosts without recompiling LAM? The command it is trying to
> execute is clearly malformed, at least according to OS X's bash.
>
> Tony Arcieri
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/