Hi,
I have installed LAM 7.0.6 on Linux. Everything compiles, and the examples run.
My program runs as well. However, it sometimes stops after hundreds of iterations and gives the error message printed below. In this example the program runs on only one node (n0).
Any ideas why that would happen?
Thank you!
Marcelo
________
MPI_Recv: process in local group is dead (rank 0, MPI_COMM_WORLD)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Recv()
Rank (0, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
PID 14093 failed on node n0 (127.0.0.1) with exit status 1.
-----------------------------------------------------------------------------