LAM/MPI General User's Mailing List Archives

From: Kumar, Ravi Ranjan (rrkuma0_at_[hidden])
Date: 2005-03-29 13:12:30


Hello,

I wrote a program in C++ using MPI. I divided a large block into smaller blocks
and assigned each block to a different node. I wish to run my code
simultaneously on different nodes in the LAM. For this I wrote the following:

for (time = 1; time <= Nt; time++)
{
    do {
        // some data exchange between neighbouring blocks (nodes)

        // some computation in each block (node/process)

        MPI_Allreduce(...to find convergence condition...);

    } while (convergence not reached);

    MPI_Barrier(..);
}

I want to have the results from all processes at each time step before moving
to the next time step, because the next time step requires the results of the
previous one.
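
A minimal sketch of the loop structure described above, written without MPI so
that it stands alone. The function name run_time_steps, the parameters tol and
max_iters, and the three callables are hypothetical and are not part of my
actual code. In an MPI program, all_reduce_max would presumably wrap
MPI_Allreduce with the MPI_MAX operation, but which reduction fits the real
convergence test is an assumption here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical driver, not the original code. The callables stand in for the
// steps of the loop:
//   exchange(time)      - data exchange with neighbouring blocks
//   compute(time)       - local computation; returns this block's residual
//   all_reduce_max(x)   - global maximum of x over all processes (assumed to
//                         wrap MPI_Allreduce with MPI_MAX in an MPI program)
// Returns the number of inner iterations used in each time step.
template <class Exchange, class Compute, class AllReduceMax>
std::vector<int> run_time_steps(int Nt, double tol, int max_iters,
                                Exchange exchange, Compute compute,
                                AllReduceMax all_reduce_max) {
    assert(Nt >= 0 && max_iters > 0);
    std::vector<int> iters_per_step;
    for (int time = 1; time <= Nt; ++time) {
        int iters = 0;
        double global_residual = 0.0;
        do {
            exchange(time);
            double local_residual = compute(time);
            // Every process sees the same reduced value, so every process
            // takes the same decision to repeat or to leave the inner loop.
            global_residual = all_reduce_max(local_residual);
            ++iters;
        } while (global_residual > tol && iters < max_iters);
        iters_per_step.push_back(iters);
    }
    return iters_per_step;
}
```

The inner loop repeats while the reduced residual is still above the
tolerance; max_iters is only a safety cap added for the sketch.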

For this, I want 'time' to be incremented simultaneously on all the
nodes/processes, which is why I am using MPI_Barrier, but this logic doesn't
seem to work.

When I run 10 processes on a single node, all the processes finish at the same
time, but when I run 10 processes on 5 nodes (2 processes per node),
synchronization fails. Ranks 1 and 6 lag behind, whereas the rest of the ranks
finish their work quite early. But when I use fewer processes/nodes (say 3 to
5), synchronization works well even without MPI_Barrier.

One thing I do not understand is why this data exchange (the MPI send and
receive routines) doesn't work as a synchronization tool. My understanding was
that the 'time' loop cannot proceed until the data exchange among all the
processes is complete, which in effect should enforce synchronization too.

How can I achieve synchronization so that all the processes move in parallel?
Is there any other MPI routine with which I can achieve this?

Please help me.

Thanks a lot!

Ravi R. Kumar