LAM/MPI General User's Mailing List Archives

From: André Kempe (andre_at_[hidden])
Date: 2003-05-04 12:03:10


As I understand it you want to insert your own signal-handler before
LAM/MPI's handlers are called.

We had the same problem too, and we solved it by creating a C array of
function pointers. Since signal(signum, handler) returns a pointer to the
previously installed handler, we store that old handler in the array, so
our custom signal handler is able to call LAM/MPI's handler in turn.

Be sure to replace the signal handlers only after MPI_Init( ... ), since
that call installs MPI's signal handlers.
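
A minimal sketch of that idea in C follows. It is not our original code:
the names old_handlers, chained_handler, install_chained_handler and
MAX_SIGNALS are made up for illustration, and fake_library_handler is only
a stand-in for whatever handler LAM/MPI installs, so the sketch can be
tried without MPI. It also assumes signal() keeps the handler installed
after delivery (BSD semantics, as on Linux with glibc).

```c
#include <signal.h>
#include <stddef.h>

#define MAX_SIGNALS 64  /* assumed upper bound on signal numbers */

typedef void (*handler_fn)(int);

/* Previously installed handlers, indexed by signal number. */
static handler_fn old_handlers[MAX_SIGNALS];

/* Counters so the effect of the chaining can be observed. */
static volatile sig_atomic_t app_calls = 0;
static volatile sig_atomic_t lib_calls = 0;

/* Hypothetical stand-in for the handler the MPI library installs. */
static void fake_library_handler(int signum)
{
    (void)signum;
    lib_calls++;
}

/* Our handler: do application work first, then call the saved handler. */
static void chained_handler(int signum)
{
    handler_fn prev;

    app_calls++;  /* application-specific cleanup would go here */

    prev = old_handlers[signum];
    if (prev != NULL && prev != SIG_DFL && prev != SIG_IGN && prev != SIG_ERR)
        prev(signum);  /* hand over to the previously installed handler */
}

/* Install chained_handler for signum and remember the previous handler.
   In a real program, call this only after MPI_Init(). Returns 0 on
   success, -1 on error. */
static int install_chained_handler(int signum)
{
    handler_fn prev;

    if (signum <= 0 || signum >= MAX_SIGNALS)
        return -1;

    prev = signal(signum, chained_handler);
    if (prev == SIG_ERR)
        return -1;

    old_handlers[signum] = prev;
    return 0;
}
```

In a real program you would call install_chained_handler() for each signal
of interest right after MPI_Init(), and fake_library_handler would not
exist. Calls made from inside a handler should be async-signal-safe.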

Andre

Henrik R. Nagel wrote:

>Hi,
>
>I have made a software framework based on LAM/MPI (6.5.9). The software is
>used by several people on Linux PCs and on an SGI Onyx2. Recently, we
>have had problems with the Onyx2. When running programs, we received error
>messages like this:
>
>>Avocado Fatal: pfMemory::new() Unable to allocate 236 bytes from arena
>>0x60004000.
>
>After some time, we discovered that the problem was the reservation of
>semaphores and shared memory keys by LAM/MPI, since the rest of the
>software framework does not use them.
>
>Each application, created with this software framework, consists of
>multiple executable files. In case of a program crash, it appears as
>though one of the processes exits without releasing allocated memory.
>
>I have tried to compensate for this by adding a "signal-handler" to all
>processes, which uses "psignal" to print an error message, and the
>"MPI_Abort" command to stop LAM/MPI. This, however, results in some of the
>processes exiting without calling the signal-handler.
>
>I have therefore compensated for this unexpected behaviour by using
>"MPI_Finalize" and "exit" instead of "MPI_Abort" in the signal-handler.
>This results in only the process that crashes calling the signal-handler.
>However, by pressing Ctrl-C, the rest of the processes also call the
>signal-handler.
>
>What is the correct way to deal with this?
>
>
>Best regards,
>
>Henrik Nagel
>
>
>