LAM/MPI General User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2007-09-15 18:25:23


On Sep 11, 2007, at 9:29 AM, Miguel Ángel González Gisbert wrote:

> I am developing a parallel program using MPI. I am working with a
> cluster of 16 dual-processor nodes. All seemed to be fine, but when I
> try to test my program (for now a simple "Hello world!" program, to
> isolate the problem) using two processes per node (adding "cpu=2" in
> the configuration file), I get the following error:
>
> magonzalez_at_baobab:~/StageVerano/HelloWorld$ mpirun -np 6 ./hello
> -----------------------------------------------------------------------------
> The selected RPI failed to initialize during MPI_INIT. This is a
> fatal error; I must abort.
>
> This occurred on host n2 (n1).
> The PID of failed process was 12851 (MPI_COMM_WORLD rank: 2)
> -----------------------------------------------------------------------------
> -----------------------------------------------------------------------------
> One of the processes started by mpirun has exited with a nonzero exit
> code. This typically indicates that the process finished in error.
> If your process did not finish in error, be sure to include a "return
> 0" or "exit(0)" in your C code before exiting the application.
>
> PID 29102 failed on node n0 (10.0.0.1) with exit status 1.
> -----------------------------------------------------------------------------

Can you try running with the mpirun parameter "-ssi rpi tcp" and see
if that works? It's hard to tell from that error message whether the
issue is with the TCP or shared memory portion of the transport engine.
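
A minimal sketch of that test, assuming a LAM universe is already booted with lamboot and that ./hello is the binary from the quoted command. The second pass with "usysv" (one of LAM's shared memory plus TCP modules) is my own addition for comparison, not something confirmed to be the module selected on this cluster; the NP and PROG variables are just illustrative names.

```shell
#!/bin/sh
# Run the same job once per RPI module to see which transport fails.
# Assumption: "usysv" is included only for comparison; it may not be the
# module LAM picks by default on this cluster.
NP=6
PROG=./hello

for rpi in tcp usysv; do
    cmd="mpirun -ssi rpi $rpi -np $NP $PROG"
    echo "running: $cmd"
    # Only execute when mpirun is actually installed on this machine.
    if command -v mpirun >/dev/null 2>&1; then
        $cmd || echo "rpi=$rpi failed with exit status $?"
    fi
done
```

If the tcp run succeeds and the usysv run fails, that would point at the shared memory portion rather than TCP.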

Thanks,

Brian

-- 
   Brian Barrett
   LAM/MPI Developer
   Make today a LAM/MPI day!