
LAM/MPI General User's Mailing List Archives


From: Dilani Perera (dilani_at_[hidden])
Date: 2005-12-09 10:34:55


Hi,

I have written an MPI program to solve large sparse systems of linear
equations.

In the program, the root process sends data to the other processes; when the
calculation is done, they send the results back to the root process.
To send and receive I am using MPI_Send and MPI_Recv.
This sending and receiving continues until a certain condition
is satisfied.

The program seems to work, but it ends up terminating due to some failure.

Inside the program there are several places where memory is allocated, but
the memory is also deallocated afterwards. When the size of the array is
larger, the program does not seem to work at all.
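
The code in Main.c is not shown here. To make the allocation and deallocation pattern mentioned above concrete, here is only a rough sketch in plain C; the helper names alloc_vector and free_vector are hypothetical and are not taken from the actual program. It shows one way to check every allocation, so that a failed malloc for a larger array is reported instead of leading to a crash later on.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical helper (not from Main.c): allocate an n-element vector of
 * doubles. Returns NULL and prints a message on stderr if n is zero, if
 * n * sizeof(double) would overflow size_t, or if malloc fails. */
double *alloc_vector(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(double)) {
        fprintf(stderr, "alloc_vector: invalid size %zu\n", n);
        return NULL;
    }

    double *v = malloc(n * sizeof *v);
    if (v == NULL) {
        fprintf(stderr, "alloc_vector: out of memory for %zu doubles\n", n);
    }
    return v;
}

/* Hypothetical helper (not from Main.c): free a vector and set the caller's
 * pointer to NULL, so an accidental second free becomes a harmless no-op. */
void free_vector(double **v)
{
    if (v != NULL) {
        free(*v);
        *v = NULL;
    }
}
```

A caller would test the result of alloc_vector for NULL before using the array, and call free_vector(&v) exactly once when the array is no longer needed.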

% mpicc -lm -o out Main.c
% mpirun -v -np 2 out A1500.txt
2822 out running on n0 (o)
22237 out running on n1

MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Recv()
Rank (1, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error. If
your process did not finish in error, be sure to include a "return 0" or
"exit(0)" in your C code before exiting the application.

PID 9940 failed on node n0 (134.153.50.235) due to signal 11.
-----------------------------------------------------------------------------
%

What can be the problem?

Thanks .....

Dilani Perera.
(MSC Candidate for Computational Sciences)
Department of Computer Science,
St. John's, NL
Canada,A1B 3X5
Tel: 709-737-6142 (office)

email : dilani_at_[hidden]
Visit me at : www.cs.mun.ca/~dilani