Hello everyone. This is my first official submission of an error from a Linux
cluster that we're running. Any ideas as to what may be causing this would be
awesome!
In gmx381_new_iter9, the error message is:
Node 13: error opening file /home/jr241/gmx381_new_iter9/local/dbout.0013
Node 13: error opening file /home/jr241/gmx381_new_iter9/local/d3plot06
Node 12: error opening file /home/jr241/gmx381_new_iter9/local/dbout.0012
Node 4: error opening file /home/jr241/gmx381_new_iter9/local/dbout.0004
In gmx381_new_iter9_3, the error message is:
Node 2: error opening file /home/jr241/gmx381_new_iter9_4/local/runrsf.0002
Node 3: error opening file /home/jr241/gmx381_new_iter9_4/local/runrsf.0003
I resubmitted the job for gmx381_new_iter9_3 in directory temp. The job stopped
again. The error message is:
MPI_Recv: process in local group is dead (rank 7, MPI_COMM_WORLD)
MPI_Recv: process in local group is dead (rank 6, MPI_COMM_WORLD)
MPI_Recv: process in local group is dead (rank 11, MPI_COMM_WORLD)
MPI_Recv: process in local group is dead (rank 10, MPI_COMM_WORLD)
Rank (7, MPI_COMM_WORLD): Call stack within LAM:
Rank (7, MPI_COMM_WORLD): - MPI_Recv()
Rank (7, MPI_COMM_WORLD): - MPI_Bcast()
Rank (7, MPI_COMM_WORLD): - MPI_Allreduce()
Rank (7, MPI_COMM_WORLD): - main()
Rank (6, MPI_COMM_WORLD): Call stack within LAM:
Rank (6, MPI_COMM_WORLD): - MPI_Recv()
Rank (6, MPI_COMM_WORLD): - MPI_Bcast()
Rank (6, MPI_COMM_WORLD): - MPI_Allreduce()
Rank (6, MPI_COMM_WORLD): - main()
Rank (10, MPI_COMM_WORLD): Call stack within LAM:
Thanks...
_______________________________________________________________
EDAG Engineering + Design, Inc.
31701 Research Park Drive
Madison Heights, MI 48071
USA
Louis Gonzales
IT Manager/BSCS
Phone: +1 248-577-4009
Mobile: +1 248-379-3299
Fax: +1 248-588-3259
e-mail:Louis.Gonzales_at_[hidden]