On Apr 20, 2005, at 10:07 AM, Kumar, Ravi Ranjan wrote:
> I wish to reduce the wall-clock runtime as much as possible by
> overlapping communication with computation. For this I used
> non-blocking MPI_Isend/MPI_Irecv with MPI_Wait(). Still, my code is
> not showing good scaling. What can be the reason for the poor
> scalability? Is it because my code is not well optimized? Is there
> any other way I can optimize it further? Please see the pseudo code
> for overlapped_comm_comp_subroutine and tell me if anything is wrong
> with the code.
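Just so we're talking about the same thing, here is a minimal C sketch of
the overlap pattern I'm assuming your overlapped_comm_comp_subroutine
follows (the neighbor ranks, buffers, and tag are made up for
illustration; your actual routine will differ):

    /* Minimal sketch: post non-blocking sends/receives, compute,
     * then wait for the transfers to finish. */
    #include <mpi.h>

    void overlapped_comm_comp(double *sendbuf, double *recvbuf, int count,
                              int left, int right, MPI_Comm comm)
    {
        MPI_Request reqs[2];
        MPI_Status  stats[2];

        /* post the non-blocking receive and send up front */
        MPI_Irecv(recvbuf, count, MPI_DOUBLE, left,  0, comm, &reqs[0]);
        MPI_Isend(sendbuf, count, MPI_DOUBLE, right, 0, comm, &reqs[1]);

        /* ... computation that does not touch sendbuf or recvbuf ... */

        /* block until both transfers have completed */
        MPI_Waitall(2, reqs, stats);

        /* ... computation that needs recvbuf ... */
    }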
I can't say why you aren't seeing good scaling - you might want to use
a tracing system like MPE from Argonne National Lab to determine where
you are spending your time. It could be something inherent in your
algorithm or it could be poor overlap of computation and communication.
LAM/MPI (like most MPI implementations, especially the free ones)
doesn't provide good overlap of computation and communication for TCP.
It's better, but not great, for GM. This is one of the things we are
working on for Open MPI - we should have real asynchronous progress,
even for TCP, when threads are available on a system.
If your messages are bigger than a couple of bytes, it's possible that
you are blocking for a while in MPI_Wait because only a little progress
was made during the MPI_Isend and the rest has to be made during
MPI_Wait, which largely defeats the purpose of all the work. If this
is the case (use something like MPE to verify), there are a couple of
options:
1) Buy an MPI implementation with real progress during computation.
   For Linux, I believe MPI/Pro is the only one that currently does.
2) Wait for Open MPI to have a stable release (sometime in 2005).
3) Use MPI_Test at various points in your algorithm to give the
   MPI implementation a chance to make (non-blocking) progress
   during the computation phase of your algorithm.
#3 is probably your best bet, as it will work equally well for LAM/MPI
and MPICH and shouldn't perturb your results too much on an MPI that
has good asynchronous progress.
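To make #3 concrete, here is a hedged sketch of what I mean. The chunk
size and the "computation" are placeholders; adapt them to your real
loop:

    /* Sketch of option 3: call MPI_Test every so often during the
     * computation so the library can make progress on the outstanding
     * request instead of doing all the work inside MPI_Wait. */
    #include <mpi.h>

    void compute_with_progress(double *work, int n, MPI_Request *req)
    {
        MPI_Status status;
        int        i, flag, done = 0;
        const int  chunk = 1000;    /* how often to give MPI a chance */

        for (i = 0; i < n; i++) {
            work[i] *= 2.0;         /* stand-in for your real computation */

            if (!done && (i % chunk) == 0) {
                /* non-blocking check; completes the request if it can */
                MPI_Test(req, &flag, &status);
                if (flag)
                    done = 1;       /* request finished; stop testing */
            }
        }

        /* if the request never completed, we still have to wait for it */
        if (!done)
            MPI_Wait(req, &status);
    }

How often to call MPI_Test is a trade-off: too rarely and you're back to
doing all the progress in MPI_Wait, too often and the call overhead eats
into your computation.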
Hope this helps,
Brian
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/