LAM/MPI General User's Mailing List Archives

From: Michael Lees (mhl_at_[hidden])
Date: 2005-08-30 07:47:04


> I have an iterative algorithm on the master that sends data to the slaves at the beginning of every iteration; the slaves return their results before the master starts the next iteration.
> Currently the code goes as follows:
> for (iter=1;iter<NumIter; iter++)
> {
> if (myid==0)
> //masters part
> if (myid!=0)
> //slaves part
> }
> This code works well, but if I change the "for" loop to a "while" loop, the algorithm fails. The error message is as follows:
>
> MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
> Rank (1, MPI_COMM_WORLD): - MPI_Recv()
> Rank (1, MPI_COMM_WORLD): - main()

Do you have any barriers at startup and shutdown?
If the code is simply

> for (iter=1;iter<NumIter; iter++)
> {
> if (myid==0)
> //masters part
> if (myid!=0)
> //slaves part
> }

you could try...

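MPI_Init(&argc, &argv);
...
MPI_Barrier(MPI_COMM_WORLD);   /* all ranks start together */
for (iter=1; iter<NumIter; iter++)
{
    if (myid==0)
    {
        //masters part
    }
    if (myid!=0)
    {
        //slaves part
    }
}
MPI_Barrier(MPI_COMM_WORLD);   /* no rank finalizes before the others finish */
...
MPI_Finalize();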

This may prevent processes shutting down too early.
Do you know why one of the processes has died? Have you run it through a
debugger?

-Mike
