Hi!
I reinstalled the BLACS library with the previous macro settings, and it is still not working. This time I am not getting the invalid-communicator error, but different ones. I ran the test executables on 4 nodes; here are the errors:
-------------------------------------------------------------------------------------
[patri@e01 ~]$ mpirun -v -np 4 /home/BLACS/TESTING/EXE/xFbtest_MPI-LINUX-0
1647 /home/BLACS/TESTING/EXE/xFbtest_MPI-LINUX-0 running on n0 (o)
1225 /home/BLACS/TESTING/EXE/xFbtest_MPI-LINUX-0 running on n1
1219 /home/BLACS/TESTING/EXE/xFbtest_MPI-LINUX-0 running on n2
1219 /home/BLACS/TESTING/EXE/xFbtest_MPI-LINUX-0 running on n3
BLACS WARNING 'No need to set message ID range due to MPI communicator.'
from {-1,-1}, pnum=0, Contxt=-1, on line 18 of file 'blacs_set_.c'.
BLACS WARNING 'No need to set message ID range due to MPI communicator.'
from {-1,-1}, pnum=1, Contxt=-1, on line 18 of file 'blacs_set_.c'.
BLACS WARNING 'No need to set message ID range due to MPI communicator.'
from {-1,-1}, pnum=2, Contxt=-1, on line 18 of file 'blacs_set_.c'.
BLACS WARNING 'No need to set message ID range due to MPI communicator.'
from {-1,-1}, pnum=3, Contxt=-1, on line 18 of file 'blacs_set_.c'.
open: No such file or directory
apparent state: unit 11 named bt.dat
lately writing direct unformatted external IO
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
PID 1647 failed on node n0 (10.5.0.1) due to signal 6.
-----------------------------------------------------------------------------
MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Recv()
Rank (1, MPI_COMM_WORLD): - MPI_Allreduce()
Rank (1, MPI_COMM_WORLD): - MPI_Comm_dup()
Rank (1, MPI_COMM_WORLD): - main()
MPI_Recv: process in local group is dead (rank 2, MPI_COMM_WORLD)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD): - MPI_Recv()
Rank (2, MPI_COMM_WORLD): - MPI_Allreduce()
Rank (2, MPI_COMM_WORLD): - MPI_Comm_dup()
Rank (2, MPI_COMM_WORLD): - main()
MPI_Recv: process in local group is dead (rank 3, MPI_COMM_WORLD)
Rank (3, MPI_COMM_WORLD): Call stack within LAM:
Rank (3, MPI_COMM_WORLD): - MPI_Recv()
Rank (3, MPI_COMM_WORLD): - MPI_Allreduce()
Rank (3, MPI_COMM_WORLD): - MPI_Comm_dup()
Rank (3, MPI_COMM_WORLD): - main()
-------------------------------------------------------------------------------------
What I observe is that the processes on all nodes except rank 0 are dead.
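From the log, the Fortran tester aborts while trying to open its input file on unit 11 ("open: No such file or directory ... unit 11 named bt.dat"), and the other ranks then die waiting in MPI_Comm_dup/MPI_Allreduce. A minimal check I can run (paths assumed from my install; bt.dat must sit in the working directory the tester is launched from):

```shell
# Sketch: verify bt.dat is present and launch from the TESTING
# directory so Fortran unit 11 resolves to it on every node.
cd /home/BLACS/TESTING
ls -l bt.dat                     # should exist here, not only on node 0
mpirun -v -np 4 ./EXE/xFbtest_MPI-LINUX-0
```

If the nodes do not share /home, bt.dat would also need to be copied to the same path on n1..n3.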
I have attached the Bmake.inc file with this mail.
Thank you.
Regards,
Srinivasa Patri