On Wed, 3 Sep 2003, Pablo Milano wrote:
> I tried adding a call to MPI_Test in the inner loop and I saw some
> performance increment. At this point, both blocking and nonblocking have
> almost the same performance.

It very much depends on your application. If your MPI processes are
tightly synchronized, there probably isn't much that LAM can do to
optimize the process. The best that you can do is verify that your
communication patterns are not inherently blocking or cascading (as in
the traditional red/black neighbor exchange examples). Where a pattern
does cascade, switching to non-blocking communication can be a big win.
Since you say that your blocking performance is similar to your
non-blocking performance, I'm guessing that this is not the case, but I
mention it anyway. :-)

Also -- double check how much this really matters. If you're
communicating infrequently, then optimizing your communication won't
matter much to the overall run time. If you're communicating a lot, then
it does matter and is worth optimizing. For example: make a quick
back-of-the-envelope estimate of how much data you're pushing around and
how long that should take to transmit across your network (e.g., if
you're sending a total of 10 MB over 100 Mbps Ethernet, assume that
you'll get 85-90 Mbps of throughput and calculate the time needed to
send it). LAM won't add much overhead to the raw TCP speeds, so it's a
good enough estimate. If this time is a small portion of the overall run
time, then tweaking your communication to gain a few ms here and there
isn't worth it.
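
To make that estimate concrete, here is a minimal sketch in C. The
function names and the 60-second run time in the usage note are
hypothetical, chosen only for illustration; the 10 MB and 85-90 Mbps
figures are the ones from the example above, using decimal units
(1 MB = 8 Mb).

```c
/* Estimated seconds to move `megabytes` of payload over a link that
 * sustains `effective_mbps` megabits per second (1 MB = 8 Mb). */
double transfer_seconds(double megabytes, double effective_mbps)
{
    return megabytes * 8.0 / effective_mbps;
}

/* Fraction of the total run time that is spent communicating. */
double comm_fraction(double comm_seconds, double run_seconds)
{
    return comm_seconds / run_seconds;
}
```

transfer_seconds(10.0, 85.0) and transfer_seconds(10.0, 90.0) both come
out at roughly 0.9 seconds. If the whole run took, say, 60 seconds,
comm_fraction() would put communication at under 2% of the run time, in
which case shaving milliseconds off it is unlikely to be noticeable.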

> I was thinking about replacing the basic nonblocking Isend and Irecv by
> Persistent Communications. Do you think I will get some performance
> benefit in LAM-MPI using this? Would I have the same TCP problems?

The only difference between persistent and regular non-blocking
communication is the setup cost. LAM separates out the setup cost in
persistent communication (it is paid once, when the request is created,
rather than on every send), so if you use the *same* buffer and
parameters for every send in your iteration, you may see a slight
speedup. However, due to TCP latencies, this speedup may be negligible
-- YMMV.
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/