It looks like the problem is exactly what the error says -- [at least]
one of the processes launched by mpirun did not invoke MPI_INIT before
exiting. LAM defines this to be a terminal error.

You should double check whether any fatal errors occur before MPI_INIT
is invoked, and make sure that no other program with the same name as
your application can be found ahead of it in the PATH (in which case
LAM will find that one first and execute it, not knowing that it's not
an MPI application).
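
As a rough way to check the PATH point, here is a minimal sketch (my
own illustration, not part of LAM) that lists every executable of a
given name in PATH lookup order. The function name find_all_on_path
and the program name "my_mpi_app" are placeholders; it assumes a
POSIX-style PATH and, as a simplification, skips empty PATH entries.
Since the PATH can differ between nodes, it would need to be run on
each node, not just the one you launch from.

```python
import os
import sys


def find_all_on_path(name, path=None):
    """Return every executable called `name` on the search path, in lookup order.

    Hypothetical helper for illustration; `path` defaults to $PATH.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # simplification: ignore empty PATH entries
        candidate = os.path.join(directory, name)
        # Keep only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    # "my_mpi_app" is a placeholder; pass your program's name instead.
    program = sys.argv[1] if len(sys.argv) > 1 else "my_mpi_app"
    hits = find_all_on_path(program)
    if not hits:
        print(f"{program}: not found on PATH")
    for position, hit in enumerate(hits):
        label = "found first" if position == 0 else "shadowed"
        print(f"{label}: {hit}")
```

If more than one path is printed, the first one is the candidate that
would be picked up by a PATH lookup, so it is worth confirming that it
is the MPI build of your program.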
On Nov 26, 2004, at 9:45 AM, Jaime Perea wrote:
> Hello,
>
> I have a very strange problem with lam 7.1.1 (and also 7.0.6 and
> 7.2b...), so probably it has to be something with my cluster setup.
>
> I replaced a few nodes on my cluster; the disks are direct copies
> of the previous ones, so there is no change in the OS.
>
> I can do lamboot and everything works (I can do lamnodes on all
> the nodes), but when I try to launch a job with mpirun I get:
>
> -----------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------
>
> If I use mpiexec -d, I get:
>
> mpiexec: Global argument parsing done
> mpiexec: Host-Node Number hash created
> mpiexec: Temporary file lam_appschema_19VOJ0 created (will be used as app schema file for mpirun)
> mpiexec: Launching MPI programs
> -----------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------
> mpiexec: Inside handle_waitpid_status Function: mpirun, Error Status: 64512
> mpiexec: mpiexec_die called
> mpiexec: deleting temprory file lam_appschema_19VOJ0
> mpirun failed with exit status 252
>
> Does anybody know what is going on here?
>
> Thanks in advance
>
> --
>
> Jaime D. Perea Duarte. <jaime at iaa dot es>
> Linux registered user #10472
>
> Dep. Astrofisica Extragalactica.
> Instituto de Astrofisica de Andalucia (CSIC)
> Apdo. 3004, 18080 Granada, Spain.
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/