Hello,
Thanks for the reply!
In fact, I am printing out the rank and the time step just before calling
MPI_Barrier. When all the ranks except 1 and 6 have reached the 250th time
step, ranks 1 and 6 have only reached the 100th. Is this just an artifact of
the printing order, or am I making a mistake in implementing what I want in
the code? I am also using MPI_Allreduce, which should likewise enforce that
all the processes step in parallel, shouldn't it? Please clarify.
Thanks!
Ravi R. Kumar
Quoting Jeff Squyres <jsquyres_at_[hidden]>:
>
> On Mar 29, 2005, at 1:12 PM, Kumar, Ravi Ranjan wrote:
>
> > I wrote a code in C++ using MPI. I divided a bigger block into smaller
> > blocks and assigned each block to a different node. I wish to run my
> > code simultaneously on different nodes in the LAM. For this I wrote
> > this code:
> >
> > for (time = 1; time <= Nt; time++)
> > {
> >     do {
> >         // some data exchange between neighbouring blocks (nodes)
> >
> >         // some computation in each block (node/process)
> >
> >         MPI_Allreduce(...to find convergence condition...);
> >
> >     } while (!converged);  // repeat until convergence is reached
> >
> >     MPI_Barrier(...);
> > }
> >
> > I want to collect the results from the different processes at each
> > time step and then move to the next time step, since the next time
> > step requires the results from the old time step.
> >
> > For this, I want 'time' to be incremented simultaneously on all the
> > nodes/processes, which is why I am using MPI_Barrier, but this logic
> > doesn't seem to work.
> >
> > When I run 10 processes on a single node, all the processes finish
> > simultaneously, but when I run 10 processes on 5 nodes (2 processes per
> > node), synchronization fails. Ranks 1 and 6 lag behind, whereas the
> > rest of the ranks finish their work quite early. But when I use a
> > smaller number of processes/nodes (say 3 to 5), synchronization works
> > well even without using MPI_Barrier.
>
> How can you tell that MCW ranks 1 and 6 are lagging?
>
> Be aware that the output sent from remote nodes does not necessarily
> appear in any particular order on the mpirun stdout. The barrier that
> you have in your loop should force all MPI processes to be more-or-less
> exactly in step (meaning that MPI guarantees that no process leaves the
> barrier until all processes have entered the barrier).
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>