It looks like hboot (an internal LAM command) is failing to find the
$PATH environment variable, which seems pretty odd. When you
globus-job-run a command, do you get no PATH at all? E.g., what happens
if you run "globus-job-run 127.0.0.1 env"?
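
To make that check a little more mechanical, something like the
following might help. It is only a rough sketch: check_path is a
hypothetical helper of my own naming (not part of LAM or Globus), and
the commented usage line assumes you already have a valid proxy from
grid-proxy-init.

```shell
# Hypothetical helper (not part of LAM or Globus): reads "env"-style
# output on stdin and reports whether a PATH entry is present.
check_path() {
    # Keep only the first line that starts with "PATH=".
    path_line=$(grep '^PATH=' | head -n 1)
    if [ -n "$path_line" ]; then
        echo "PATH is set: ${path_line#PATH=}"
    else
        echo "PATH is missing"
        return 1
    fi
}

# Usage, assuming a valid Globus proxy already exists:
#   globus-job-run 127.0.0.1 env | check_path
```

If that reports a missing PATH, the environment that Globus gives to
remote jobs is the thing to look at, rather than LAM itself.
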
On Jun 6, 2005, at 10:46 PM, Lai Swan wrote:
> Dear All,
>
> I am trying to run lamboot and got the following error:
>
> [vasptest@orlon31 testing]$ lamboot -v -ssi boot globus hosts
> LAM 7.1.1/MPI 2 C++ - Indiana University
> n-1<23931> ssi:boot:base:linear: booting n0 (127.0.0.1)
> ERROR: LAM/MPI unexpectedly received the following on stderr:
> -----------------------------------------------------------------------------
>
> LAM encountered an error when invoking the library call "getenv".
> This is an unexpected error; we don't have much additional information
> here. Perhaps this Unix error message will help:
> Unix errno: 1268
> Unknown error 1268
> -----------------------------------------------------------------------------
>
> -----------------------------------------------------------------------------
>
> LAM failed to execute a LAM binary on the remote node "127.0.0.1".
> LAM attempted to execute a process on the remote node "127.0.0.1",
> but received some output on the standard error.
> LAM tried to use the command "/usr/local/gt321/bin/globus-job-run" to
> invoke the following command:
> /usr/local/gt321/bin/globus-job-run 127.0.0.1
> /usr/local/lam-7.1.1/bin/hboot -t -c
> /usr/local/lam-7.1.1/etc/lam-conf.lamd -v -I "-H 127.0.0.1 -P 45587 -n
> 0 -o 0" -prefix /usr/local/lam-7.1.1
> The problem may be because:
> - The Globus GRAM client returned some output on the stderr
> - You have not done 'grid-proxy-init'. You need to do that before
> LAM can boot as it uses globus-job-run to start the LAM daemons.
> - LAM is not able to find binaries in the 'prefix' path you
> specified in the boot hostfile. Check the path, it should point
> to the directory where LAM/MPI is installed on this host.
> Try to invoke the command listed above manually at a Unix prompt.
> When you can get this command to execute successfully by hand, LAM
> will probably be able to function properly.
> -----------------------------------------------------------------------------
>
> n-1<23931> ssi:boot:base:linear: Failed to boot n0 (127.0.0.1)
> n-1<23931> ssi:boot:base:linear: aborted!
> lamboot did NOT complete successfully
>
> What should I do to solve it?
> I would be very grateful if I could hear your reply!!
>
> Regards,
> Swan, HPC Team, Chinese University of Hong Kong
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/