LAM/MPI General User's Mailing List Archives

From: Simon Prunet (prunet_at_[hidden])
Date: 2006-03-13 13:35:01


Dear Jeff,

Indeed, the ipcs command showed lots of claimed resources.
Neither lamclean nor lamhalt solved the problem, but ipcrm did.
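
For anyone hitting the same problem, the cleanup described here (list the claimed SYSV resources with ipcs, then remove the orphaned ones by ID with ipcrm) can be sketched in Python. This is only a sketch under assumptions: it expects the column layout printed by the Linux util-linux ipcs (key, id, owner, ...), the helper names owned_ids, removal_commands and print_cleanup_commands are made up for illustration, and it only prints the ipcrm commands rather than running them, since other programs owned by the same user may still be using some of these resources.

```python
import getpass
import subprocess


def owned_ids(ipcs_output, owner):
    """Return the IDs of IPC objects owned by `owner` in Linux `ipcs` text output."""
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows look like "key id owner perms ..."; the key column starts
        # with "0x", which lets us skip section titles and column headers.
        if len(fields) >= 3 and fields[0].startswith("0x") and fields[2] == owner:
            ids.append(fields[1])
    return ids


def removal_commands(ipcs_output, owner, flag):
    """Build one `ipcrm <flag> <id>` command per object owned by `owner`."""
    return [["ipcrm", flag, ident] for ident in owned_ids(ipcs_output, owner)]


def print_cleanup_commands():
    """Dry run: print ipcrm commands for the current user's semaphores and
    shared memory segments without executing them."""
    user = getpass.getuser()
    # -s selects semaphore arrays, -m shared memory segments (same flags
    # for both ipcs and ipcrm).
    for flag in ("-s", "-m"):
        listing = subprocess.run(
            ["ipcs", flag], capture_output=True, text=True, check=True
        ).stdout
        for cmd in removal_commands(listing, user, flag):
            print(" ".join(cmd))
```

Calling print_cleanup_commands() on the affected node prints one ipcrm line per semaphore array or shared memory segment owned by the current user; review the list before running any of them by hand.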

Thanks a lot!

Simon

Jeff Squyres wrote:

>I'm guessing that you have run out of SYSV semaphores. These are OS-
>managed resources that can unfortunately persist even after a process
>dies. For example, if you have an MPI process that is using sysv (or
>usysv) that dies badly, it can orphan SYSV resources in the OS. If
>this happens a few times, the OS may run out of SYSV resources and
>you won't be able to run any new sysv/usysv processes.
>
>The lamclean command (and lamhalt) should release all of them, and
>you should be able to run again. If that doesn't work, run the
>"ipcs" command and see if there are excessive resources being
>claimed; the "ipcrm" command should be able to remove them.
>
>
>
>On Mar 12, 2006, at 12:45 PM, Simon Prunet wrote:
>
>
>
>>Hello all,
>>
>>I have been using the LAM suite, version 7.1.1, successfully
>>for some time on a 4-way SMP 64-bit Linux box.
>>
>>All of a sudden, mpirun, even on very simple codes,
>>stopped running on more than one processor, with
>>the following error:
>>
>>-----------------------------------------------------------------------------
>>The selected RPI failed to initialize during MPI_INIT. This is a
>>fatal error; I must abort.
>>
>>This occurred on host node-01 (n0).
>>The PID of failed process was 27008 (MPI_COMM_WORLD rank: 0)
>>-----------------------------------------------------------------------------
>>
>>Going through past posts, I found hints that the problem was
>>related to the RPI used by default. Indeed, the problem disappears
>>when I use the tcp RPI, and only appears with the sysv or usysv
>>RPIs...
>>
>>Any ideas?
>>
>>Thanks for your help,
>>
>>Simon
>>_______________________________________________
>>This list is archived at http://www.lam-mpi.org/MailArchives/lam/