It seems that you have no PATH environment variable whatsoever. Right
now, hboot will complain about this (i.e., exactly the error that you
are seeing).
I'll update hboot to not make this an error, but rather handle this
situation properly. This will be available in tomorrow's nightly
tarball (I'll put it both on the trunk and the upcoming 7.1.2 release,
but won't be cutting a new 7.1.2 beta tarball).
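To illustrate the idea only: hboot itself is written in C, and this is not
the actual change. A minimal shell sketch of treating an unset PATH as a
case to handle rather than an error might look like the following. The
helper name effective_path and the fallback directories are assumptions
made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: report the search path to use, tolerating an unset PATH.
# The fallback directories are an assumption, not what hboot actually uses.
effective_path() {
    # ${PATH-...} expands to the fallback only when PATH is unset.
    printf '%s\n' "${PATH-/usr/bin:/bin}"
}

# Show the value that would be used in the current environment.
effective_path
```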
As an alternate workaround, you might want to see how to set up
globus-job-run so that it sets a PATH for the launched job.
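One generic way to give a job a PATH is to wrap the command in env, called
by its absolute path. The sketch below only simulates a launcher that
passes no environment (using "env -i") and does not touch Globus at all.
The JOB_PATH value is an assumption; adjust it to the directories the job
really needs.

```shell
#!/bin/sh
# Find env by absolute path: a job with no PATH cannot look it up by name
# (in the quoted transcript, "env" failed but "/bin/env" worked).
ENV_BIN=$(command -v env)

# Assumed value for illustration only.
JOB_PATH=/usr/local/lam-7.1.1/bin:/usr/bin:/bin

# "env -i" starts the child with an empty environment, like the Globus job.
without_path=$("$ENV_BIN" -i "$ENV_BIN")

# Wrapping the command in "env PATH=..." adds the variable back.
with_path=$("$ENV_BIN" -i "$ENV_BIN" PATH="$JOB_PATH" "$ENV_BIN")

echo "without wrapper: [$without_path]"
echo "with wrapper:    [$with_path]"
```

To check by hand, the same wrapping could be tried through Globus as
"globus-job-run 127.0.0.1 /bin/env PATH=/usr/bin:/bin /bin/env". That is
untested here and assumes globus-job-run passes the extra arguments through
to the remote command unchanged.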
On Jun 7, 2005, at 6:29 PM, Swan wrote:
> Hi Jeff,
>
> I executed /bin/env using globus-job-run as you suggested;
> the output doesn't have any PATH environment variable.
>
> [vasptest_at_orlon31 testing]$ globus-job-run 127.0.0.1 env
> GRAM Job failed because the executable does not exist (error code 5)
> [vasptest_at_orlon31 testing]$ globus-job-run 127.0.0.1 /bin/env
> HOME=/home/vasptest
> LOGNAME=vasptest
> GLOBUS_GRAM_JOB_CONTACT=https://orlon31.itsc.cuhk.edu.hk:34241/6379/1118193672/
> GLOBUS_LOCATION=/usr/local/gt321
> X509_USER_PROXY=/home/vasptest/.globus/job/orlon31.itsc.cuhk.edu.hk/6379.1118193672/x509_up
> GLOBUS_GRAM_MYJOB_CONTACT=URLx-nexus://orlon31.itsc.cuhk.edu.hk:34242/
>
> What should I do to make it work properly?
> Thank you for your reply; I look forward to hearing from you again.
>
> Regards,
> Swan, HPC team, Chinese University of Hong Kong
>> ----- Original Message -----
>> From: Jeff Squyres
>> To: General LAM/MPI mailing list
>> Sent: June 8, 2005, 04:55 AM
>> Subject: Re: LAM: lamboot on globus
>>
>> What it looks like is happening is that hboot (an internal LAM command)
>> is failing to find the $PATH environment variable -- which seems pretty
>> odd. When you globus-job-run a command, do you get no PATH at all?
>> E.g., what happens if you "globus-job-run 127.0.0.1 env"?
>>
>>
>> On Jun 6, 2005, at 10:46 PM, Lai Swan wrote:
>>
>> > Dear All,
>> >
>> > I am trying to run lamboot and got the following error:
>> >
>> > [vasptest_at_orlon31 testing]$ lamboot -v -ssi boot globus hosts
>> > LAM 7.1.1/MPI 2 C++ - Indiana University
>> > n-1<23931> ssi:boot:base:linear: booting n0 (127.0.0.1)
>> > ERROR: LAM/MPI unexpectedly received the following on stderr:
>> >
>> > -----------------------------------------------------------------------------
>> >
>> > LAM encountered an error when invoking the library call "getenv".
>> > This is an unexpected error; we don't have much additional information
>> > here. Perhaps this Unix error message will help:
>> > Unix errno: 1268
>> > Unknown error 1268
>> >
>> > -----------------------------------------------------------------------------
>> >
>> >
>> > -----------------------------------------------------------------------------
>> >
>> > LAM failed to execute a LAM binary on the remote node "127.0.0.1".
>> > LAM attempted to execute a process on the remote node "127.0.0.1",
>> > but received some output on the standard error.
>> > LAM tried to use the command "/usr/local/gt321/bin/globus-job-run" to
>> > invoke the following command:
>> > /usr/local/gt321/bin/globus-job-run 127.0.0.1
>> > /usr/local/lam-7.1.1/bin/hboot -t -c
>> > /usr/local/lam-7.1.1/etc/lam-conf.lamd -v -I "-H 127.0.0.1 -P 45587 -n
>> > 0 -o 0" -prefix /usr/local/lam-7.1.1
>> > The problem may be because:
>> > - The Globus GRAM client returned some output on the stderr
>> > - You have not done 'grid-proxy-init'. You need to do that before
>> >   LAM can boot as it uses globus-job-run to start the LAM daemons.
>> > - LAM is not able to find binaries in the 'prefix' path you
>> >   specified in the boot hostfile. Check the path, it should point to
>> >   the directory where LAM/MPI is installed on this host.
>> > Try to invoke the command listed above manually at a Unix prompt.
>> > When you can get this command to execute successfully by hand, LAM
>> > will probably be able to function properly.
>> >
>> > -----------------------------------------------------------------------------
>> >
>> > n-1<23931> ssi:boot:base:linear: Failed to boot n0 (127.0.0.1)
>> > n-1<23931> ssi:boot:base:linear: aborted!
>> > lamboot did NOT complete successfully
>> >
>> > What should I do to solve it?
>> > I would be very grateful if I could hear your reply!!
>> >
>> > Regards,
>> > Swan, HPC Team, Chinese University of Hong Kong
>> >
>> >
>> > _______________________________________________
>> > This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>> >
>>
>> --
>> {+} Jeff Squyres
>> {+} jsquyres_at_[hidden]
>> {+} http://www.lam-mpi.org/
>>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/