The error message shows that there is a bug in your code in one or more of the processes started on the nodes, and that this bug forces the program to exit. Whenever a process started on any node exits abnormally, LAM generates this error; I have run into the same problem several times. I advise you to check your code again. One common bug is a segmentation fault: the program reserves a block of memory and then accesses an address outside that block. A bug of this kind appears only at run time, and only for certain input values; it is not caught at compile time.
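As an illustration of the kind of check I mean, here is a minimal sketch in plain C (no MPI calls). I have not seen your code, so everything here is an assumption: the `Matrix` type and the `matrix_init`, `matrix_free`, `matrix_at` and `matrix_multiply` helpers are hypothetical names, not part of LAM or MPI. The idea is to allocate the matrices on the heap from their real dimensions and to route every element access through a checked accessor, so that an out-of-range index stops the program with a file and line number instead of corrupting memory or crashing somewhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: a heap-allocated, row-major matrix. */
typedef struct {
    size_t rows;
    size_t cols;
    double *data;
} Matrix;

/* Allocate a zero-filled rows x cols matrix.
 * Returns 0 on success, -1 if the size overflows or allocation fails. */
static int matrix_init(Matrix *m, size_t rows, size_t cols)
{
    m->rows = rows;
    m->cols = cols;
    m->data = NULL;
    if (cols != 0 && rows > SIZE_MAX / cols)
        return -1;                      /* rows * cols would overflow */
    m->data = calloc(rows * cols, sizeof *m->data);
    return m->data != NULL ? 0 : -1;
}

/* Release the storage owned by the matrix. */
static void matrix_free(Matrix *m)
{
    free(m->data);
    m->data = NULL;
}

/* Checked element access: the assert aborts with file and line
 * if (i, j) lies outside the allocated block. */
static double *matrix_at(const Matrix *m, size_t i, size_t j)
{
    assert(i < m->rows && j < m->cols);
    return &m->data[i * m->cols + j];
}

/* Compute c = a * b. Returns -1 if the dimensions do not agree,
 * 0 otherwise. c must already be allocated as a->rows x b->cols. */
static int matrix_multiply(const Matrix *a, const Matrix *b, Matrix *c)
{
    if (a->cols != b->rows || c->rows != a->rows || c->cols != b->cols)
        return -1;
    for (size_t i = 0; i < a->rows; i++) {
        for (size_t j = 0; j < b->cols; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < a->cols; k++)
                sum += *matrix_at(a, i, k) * *matrix_at(b, k, j);
            *matrix_at(c, i, j) = sum;
        }
    }
    return 0;
}
```

If the buffers in your program are sized for the smaller test cases, or the send and receive counts do not match the real matrix size, a check like this should fail at the first bad index when you run the 95x95 case. Compiling with -g and running the failing rank under a debugger is another way to find the offending line.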
I hope this helps.
----- Original Message -----
From: RANGI, JAI
To: 'lam_at_[hidden]'
Sent: Wednesday, March 31, 2004 3:22 PM
Subject: LAM: MPIRUN error
I got this error while doing a matrix multiplication of two matrices of size 95x95.
I don't get any error if the matrices are, say, 95x55 and 55x95, or smaller than this. I am running the lam-7.0-67 version of LAM, and the cluster is made of 64-bit Opteron processors with the SUSE 64-bit operating system.
I never had any problem with the lam-6.5.4-1dyn version of LAM on a different cluster built out of Pentium 2 machines. There I am able to do matrix multiplications of up to 500x500.
MPI_Send: process in local group is dead (rank 0, MPI_COMM_WORLD)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Send()
Rank (0, MPI_COMM_WORLD): - main()
----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit code. This typically indicates that the process finished in error. If your process did not finish in error, be sure to include a "return 0" or "exit(0)" in your C code before exiting the application.
PID 14236 failed on node n12 (192.168.1.113) with exit status 1.
----------------------------------------------------------------------------
Any hint will be appreciated
Thanks
Jai Rangi
Unix System Administrator, Computing Services,
South Dakota State University
Brookings SD 57006.
email: jai_rangi_at_[hidden]
Ph: 605 688 4689
Fax: 6056884605
-------------------------------------------------------
In the world with no fences, why would you need Gates ?
- Linux
-------------------------------------------------------
------------------------------------------------------------------------------
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/