Brian, I don't understand the "progress behaviour" you talk
about. My program basically works as follows:
1) Initialize data
2) Compute until some condition is reached
3) Start all the nonblocking send and receive operations (34 sends and
34 receives per process)
4) Continue the computation (this step takes much longer than step 2)
5) Wait for all the nonblocking operations to complete
6) If more computation is needed, go to step 2
The blocking approach is exactly the same (same code), except that in
step 3 the send and receive operations are blocking and, of course,
step 5 is omitted.
I use MPI functions in 1) (MPI_Start), in 3) (MPI_Isend and MPI_Irecv),
and in 5) (MPI_Waitall).
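In case it helps, here is a minimal sketch of that loop. It is not my actual code: the sizes and iteration counts are made up, and a single process exchanges one buffer with itself instead of posting the real 34 sends and 34 receives to neighbor ranks, so the sketch is self-contained.

```c
#include <mpi.h>
#include <stdio.h>

#define NITER 3      /* made-up iteration count */
#define COUNT 1024   /* made-up buffer size */

int main(int argc, char **argv)
{
    double sendbuf[COUNT], recvbuf[COUNT], work[COUNT];
    MPI_Request reqs[2];
    int rank, iter, i;

    MPI_Init(&argc, &argv);                          /* 1) initialize */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (iter = 0; iter < NITER; iter++) {
        for (i = 0; i < COUNT; i++)                  /* 2) short computation */
            sendbuf[i] = (double)(iter + i);

        /* 3) post the nonblocking operations (here a single
         * self send/receive pair stands in for the real 34 + 34) */
        MPI_Irecv(recvbuf, COUNT, MPI_DOUBLE, rank, 0,
                  MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(sendbuf, COUNT, MPI_DOUBLE, rank, 0,
                  MPI_COMM_WORLD, &reqs[1]);

        /* 4) long computation; note it must not touch sendbuf
         * while the send is still pending */
        for (i = 0; i < COUNT; i++)
            work[i] = (double)i * 0.5;

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   /* 5) complete all */
        /* 6) loop continues while more computation is needed */
    }

    if (rank == 0)
        printf("completed %d iterations, work[0]=%.1f\n", NITER, work[0]);
    MPI_Finalize();
    return 0;
}
```

Compile with mpicc and run under mpirun; with one process the self send/receive pair completes inside MPI_Waitall.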
Do I have to explicitly call more MPI functions in order to make the
communication progress?
It is important to note that, if the buffer size to send/receive is
small, the nonblocking operations perform better; but as the size
increases, performance drops and the blocking approach works better.
Please tell me if I can provide any more data.
Thanks.
> > -----Original Message-----
> > From: Brian W. Barrett [mailto:brbarret_at_[hidden]]
> > Sent: Wednesday, August 13, 2003 1:34 PM
> > To: General LAM/MPI mailing list
> > Subject: Re: LAM: Performance on MPI using nonblocking
> communications
> >
> >
> > On Tuesday, August 12, 2003, at 02:03 PM, Pablo Milano wrote:
> >
> > > I am testing performance on a parallel program written in C++
> > > with MPI. I compared the performance of the same program using
> > > blocking and nonblocking communications (MPI_Send/MPI_Recv on one
> > > side, and MPI_Isend/MPI_Irecv on the other).
> > > The strange point is that, if the size of the transmitted/received
> > > buffers is small, the nonblocking approach works better, but with
> > > bigger sizes, the same approach works much worse than the blocking
> > > one. The bigger the buffer size, the worse the performance
> > > difference between blocking and nonblocking. Does anybody have any
> > > experience with nonblocking communications in MPI?
> >
> > The performance characteristics you are seeing are not typical for
> > LAM. There will be a slight performance difference between
> > non-blocking and blocking communication, but it should be fairly
> > small. One important note about non-blocking communication in LAM -
> > there is virtually no progress made on message transmission for
> > messages larger than 64K when you are not in an MPI function. Some
> > benchmarks (very few, really) will expose this limitation - but
> > that does not sound like the problem you are having.
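[If I understand this correctly, one way to "be in an MPI function" during step 4 would be to poke the library periodically with MPI_Testall. A hypothetical sketch, with made-up sizes and placeholder work, using a single process sending to itself so it is self-contained:]

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define COUNT (128 * 1024)   /* 1 MB of doubles: well past the 64K limit */

int main(int argc, char **argv)
{
    double *sendbuf, *recvbuf;
    MPI_Request reqs[2];
    int rank, flag, chunk;
    double acc = 0.0;
    long i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    sendbuf = malloc(COUNT * sizeof(double));
    recvbuf = malloc(COUNT * sizeof(double));
    for (i = 0; i < COUNT; i++)
        sendbuf[i] = (double)i;

    /* post the large nonblocking transfer (self send/receive here) */
    MPI_Irecv(recvbuf, COUNT, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, COUNT, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, &reqs[1]);

    /* long computation broken into chunks; each MPI_Testall call
     * gives the library a chance to move data along */
    for (chunk = 0; chunk < 10; chunk++) {
        for (i = 0; i < 100000; i++)
            acc += (double)i * 1e-9;               /* placeholder work */
        MPI_Testall(2, reqs, &flag, MPI_STATUSES_IGNORE);
    }

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);     /* ensure completion */
    if (rank == 0)
        printf("transfer complete, recvbuf[1]=%.1f\n", recvbuf[1]);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```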
> >
> > > Another point is that, as I use nonblocking sends that match with
> > > nonblocking receives, I could wait only for the receives to end
> > > and be sure the sends have ended too; but if I don't wait (using
> > > MPI_Wait or MPI_Waitall) for EVERY send and receive, my program
> > > leaks memory. Is there any way to wait only for the receives
> > > without leaking memory?
> >
> > You should read Section 3.7.3 of the MPI-1 standard. You have to
> > free the request associated with every communication. Normally,
> > this is done at the conclusion of MPI_TEST or MPI_WAIT (or one of
> > their friends). However, on the sending side, you can also free the
> > request with MPI_REQUEST_FREE. Note, of course, that due to the
> > progress behavior of LAM, it is possible your message will take a
> > very long time to complete if you never enter MPI functions so that
> > LAM can make progress.
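[So, if I read Section 3.7.3 right, the idea would be to free each send request immediately after posting it and wait only on the receives. A hypothetical self-contained sketch (single process, self send/receive, made-up size):]

```c
#include <mpi.h>
#include <stdio.h>

#define COUNT 256   /* made-up size */

int main(int argc, char **argv)
{
    double sendbuf[COUNT], recvbuf[COUNT];
    MPI_Request sreq, rreq;
    int rank, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < COUNT; i++)
        sendbuf[i] = (double)i;

    MPI_Irecv(recvbuf, COUNT, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, &rreq);
    MPI_Isend(sendbuf, COUNT, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, &sreq);

    /* Release the send request right away (MPI-1 Section 3.7.3): no
     * MPI_Wait on it and no leaked request. The send still completes
     * in the background. Caveat: sendbuf must not be reused until the
     * matching receive is known to have completed. */
    MPI_Request_free(&sreq);

    MPI_Wait(&rreq, MPI_STATUS_IGNORE);  /* wait only on the receive */

    if (rank == 0)
        printf("receive complete, recvbuf[1]=%.1f\n", recvbuf[1]);
    MPI_Finalize();
    return 0;
}
```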
> >
> >
> > Hope this helps,
> >
> > Brian
> >
> > --
> > Brian Barrett
> > LAM/MPI developer and all around nice guy
> > Have a LAM/MPI day: http://www.lam-mpi.org/
> >
> >
>