LAM/MPI General User's Mailing List Archives

From: Pablo Milano (pablom_at_[hidden])
Date: 2003-09-05 09:02:33


First, thank you very much for the information.
I did some research by calling MPI_Test and MPI_Testall. I found the
"optimal" number of times to call them so as not to waste time in those
calls, and I compared performance between the "patched" nonblocking
solution and the blocking one.
In ALL cases, with different input values and different (sufficiently
large) buffer sizes, the performance of both approaches is EXACTLY the
same. In other words, I couldn't speed up my application even though
there is enough computation time to overlap with communication.
I looked at the upshot graphs and the behaviour of both approaches seems
the same, but I have a pretty old version of upshot. Do you know a
better tool for viewing alog or clog files?
Thanks again.

> -----Original Message-----
> From: lam-bounces_at_[hidden]
> [mailto:lam-bounces_at_[hidden]] On Behalf Of Jeff Squyres
> Sent: Wednesday, September 03, 2003 10:11 AM
> To: General LAM/MPI mailing list
> Subject: RE: LAM: RE: Performance on MPI using nonblocking communications
>
>
> On Wed, 3 Sep 2003, Pablo Milano wrote:
>
> > I tried adding a call to MPI_Test in the inner loop and I saw some
> > performance increment. At this point, both blocking and nonblocking
> > have almost the same performance.
>
> It very much depends on your application. If your MPI processes are
> tightly synchronized, there probably isn't much that LAM can do to
> optimize the process. The best that you can do is verify that your
> communication patterns are not blocking in nature or cascading (such
> as the traditional red/black neighbor example scenarios, etc.). In
> such cases, using non-blocking forms of communication can be a big
> win. Since you say that your blocking performance is similar to the
> non-blocking performance, I'm guessing that this is not the case, but
> I mention it anyway. :-)
>
> Also -- double check how much this really matters. If you're
> communicating infrequently, then optimizing your communication won't
> matter much to the overall run time. If you're communicating a lot,
> then this does matter and it is worth optimizing. For example: do a
> quick WAG measurement of how much data you're pushing around and how
> much time that should take to transmit across your media (e.g., if
> you're sending a total of 10MB over 100Mbps ethernet, estimate that
> you'll get 85-90Mbps throughput and calculate the time necessary to
> send it). LAM won't add much overhead to the raw TCP speeds, so it's
> a good enough estimate.
>
> If this time is a small portion of the overall run time, then tweaking
> your communication to gain a few ms here and there isn't worth it.
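
The back-of-the-envelope estimate described above can be written down as a small helper. This is a sketch under stated assumptions: transfer_seconds and comm_fraction are hypothetical names, and the units are taken as decimal (1 MB = 10^6 bytes, 1 Mbps = 10^6 bits per second). With those assumptions, 10 MB at 85 to 90 Mbps works out to roughly 0.89 to 0.94 seconds.

```c
#include <assert.h>

/* Estimated seconds to move `megabytes` of payload over a link that
 * sustains `effective_mbps` megabits per second.
 * Assumes decimal units: 1 MB = 10^6 bytes, 1 Mbps = 10^6 bits/s. */
double transfer_seconds(double megabytes, double effective_mbps)
{
    assert(effective_mbps > 0.0);
    return megabytes * 8.0 / effective_mbps; /* 8 bits per byte */
}

/* Estimated share of the total run time spent communicating. */
double comm_fraction(double comm_seconds, double total_run_seconds)
{
    assert(total_run_seconds > 0.0);
    return comm_seconds / total_run_seconds;
}
```

For instance, if the whole run takes 100 seconds and the estimated transfer time is about 1 second, comm_fraction gives about 0.01, which suggests that tuning the communication is unlikely to change the run time noticeably.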
>
> > I was thinking about replacing the basic nonblocking Isend and Irecv
> > by Persistent Communications. Do you think I will get some
> > performance benefit in LAM-MPI using this? Would I have the same TCP
> > problems?
>
> The only difference between persistent and regular non-blocking
> communication is the setup cost. LAM separates out the setup cost in
> persistent communication, so if you use the *same* buffer and
> parameters for every send in your iteration, you may see a slight
> speedup. However, due to TCP latencies, this speedup may be
> negligible -- YMMV.
>
> --
> {+} Jeff Squyres
> {+} jsquyres_at_[hidden]
> {+} http://www.lam-mpi.org/
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>