LAM/MPI General User's Mailing List Archives

From: Jaime Perea (jaime_at_[hidden])
Date: 2004-11-26 09:45:51


Hello,

I have a very strange problem with LAM 7.1.1 (and also with 7.0.6 and 7.2b...),
so it probably has something to do with my cluster setup.

I replaced a few nodes in my cluster. The disks are direct copies of
the previous ones, so the OS is unchanged.

I can run lamboot and everything works (I can run lamnodes on all
the nodes), but when I try to launch a job with mpirun I get:
 
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------

If I use mpiexec -d, I get:

mpiexec: Global argument parsing done
mpiexec: Host-Node Number hash created
mpiexec: Temporary file lam_appschema_19VOJ0 created (will be used as app schema file for mpirun)
mpiexec: Launching MPI programs
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
mpiexec: Inside handle_waitpid_status Function: mpirun, Error Status: 64512
mpiexec: mpiexec_die called
mpiexec: deleting temprory file lam_appschema_19VOJ0
mpirun failed with exit status 252
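
For what it is worth, the two numbers in that output look consistent with each other. Assuming "Error Status: 64512" is a raw waitpid-style status using the conventional Unix layout (exit code in the second byte, terminating signal in the low 7 bits), it decodes to exit code 252 with no signal. The helper name decode_wait_status below is made up for illustration, and this is only a sketch under that assumption about how mpiexec reports the value:

```python
def decode_wait_status(status):
    """Split a raw wait status into (exited_normally, exit_code, signal).

    Assumes the conventional Unix encoding; this is not taken from the
    LAM sources.
    """
    signal = status & 0x7F            # low 7 bits: terminating signal, 0 if none
    exit_code = (status >> 8) & 0xFF  # second byte: exit code of the process
    return signal == 0, exit_code, signal


exited, code, sig = decode_wait_status(64512)
print(exited, code, sig)  # True 252 0
```

So mpirun seems to have exited on its own with code 252 rather than being killed by a signal.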

Does anybody know what is going on here?

Thanks in advance

-- 
           Jaime D. Perea Duarte. <jaime at iaa dot es>
             Linux registered user #10472
           Dep. Astrofisica Extragalactica.
           Instituto de Astrofisica de Andalucia (CSIC)
           Apdo. 3004, 18080 Granada, Spain.