
LAM/MPI General User's Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-08-16 11:49:16


My previous reply was with TCP in mind, but there are, of course,
multiple different underlying communication mechanisms (e.g., shared
memory). Each one has a different amount of buffering (e.g., TCP
defaults to 64k; the shared memory transports default to less than
that).

It sounds like your program is assuming more buffering than exists --
as mentioned in the column I pointed you to, MPI_Send *may* block. One
reason that MPI_Send can block is that it is waiting for a matching
receive to be posted (which sounds like what is happening in your case).
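
To make the failure mode concrete, here is a minimal sketch (mine, not
taken from your program) of the head-to-head send pattern that produces
exactly this kind of deadlock. The 1 MB message size is illustrative,
chosen only to be larger than the transport's internal buffering:

/*
 * Both ranks call MPI_Send first.  Small messages may appear to work
 * because the transport buffers them, but once a message no longer
 * fits in that buffering, each MPI_Send blocks waiting for the peer's
 * receive, which is never reached because the peer is also blocked in
 * MPI_Send -- so the two ranks deadlock.
 */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, peer;
    int count = 1024 * 1024;        /* deliberately larger than any buffering */
    char *sendbuf, *recvbuf;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = (rank == 0) ? 1 : 0;     /* assumes exactly 2 ranks */

    sendbuf = malloc(count);
    recvbuf = malloc(count);

    /* Send before receive on *both* ranks: neither send can complete
       until the other rank posts its receive, so both block here. */
    MPI_Send(sendbuf, count, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, count, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &status);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}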

Check out the MPI semantics of sending defined in the MPI-1 standard:

http://www.mpi-forum.org/docs/mpi-11-html/node40.html#Node40
http://www.mpi-forum.org/docs/mpi-11-html/node41.html#Node41
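
If the intent is a two-way exchange, a sketch of the deadlock-free
alternatives (the tags and the use of exactly 2 ranks are illustrative;
the 132-byte message size is taken from your description) would look
like:

/*
 * Two deadlock-free ways for a pair of ranks to exchange messages:
 * post the receive as a nonblocking call before the send, or let
 * MPI_Sendrecv pair the two operations for you.
 */
#include <mpi.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, peer;
    char sendbuf[132], recvbuf[132];   /* 132-byte messages, as in your app */
    MPI_Request reqs[2];
    MPI_Status stats[2], status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = (rank == 0) ? 1 : 0;        /* assumes exactly 2 ranks */
    memset(sendbuf, rank, sizeof(sendbuf));

    /* Option 1: nonblocking receive and send, then wait for both. */
    MPI_Irecv(recvbuf, 132, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, 132, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &reqs[1]);
    MPI_Waitall(2, reqs, stats);

    /* Option 2: MPI_Sendrecv does the matched exchange in one call. */
    MPI_Sendrecv(sendbuf, 132, MPI_CHAR, peer, 1,
                 recvbuf, 132, MPI_CHAR, peer, 1,
                 MPI_COMM_WORLD, &status);

    MPI_Finalize();
    return 0;
}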

On Aug 16, 2005, at 12:16 PM, Michael Lees wrote:

> Hello again,
>
> What is the default behaviour when two MPI processes try to perform an
> MPI_Send at the same time?
>
> Is this type of behaviour allowed in MPI, and if so, does it require
> the use of non-blocking sends (or some type of buffer)?
>
> I've managed to achieve deadlock with two MPI processes stuck in
> MPI_Send calls to one another. After our previous discussions I would
> have thought the sends would have completed by placing the messages
> into a buffer before sending. However, it seems both MPI_Sends are
> stuck in the spin lock in lam_ssi_rpi_usysv_low_fastsend().
>
> Thanks for all your help
>
> Mike
>
>
>
> Michael Lees wrote:
>>
>> Jeff Squyres wrote:
>>
>>> On Aug 12, 2005, at 11:22 AM, Michael Lees wrote:
>>>
>>>
>>>
>>>>>> I have a thread for performing asynchronous sending and receiving,
>>>>>> something along the lines of...
>>>>>>
> >>>>>> while (running) {
> >>>>>>     MPI_Irecv(..., &request);
> >>>>>>     while (recvflag == 0) {
> >>>>>>         MPI_Test(&request, &recvflag, &status);
> >>>>>>         MPI_Send(...);
> >>>>>>         yield();
> >>>>>>     }
> >>>>>>     yield();
> >>>>>> }
>>>>>
>>>>>
>>>>> Wow, that's a lot of sending. :-)
>>>>
> >>>> Is it though? As far as I understand it, the blocking send will
> >>>> wait until a matching receive is posted? Am I wrong? So there'll
> >>>> only be one send per receive?
>>>
>>>
>>> No, not necessarily. Check out the sidebar entitled "To block or not
>>> to block" in this column for an explanation:
>>>
>>> http://cw.squyres.com/columns/2004-02-CW-MPI-Mechanic.pdf
>>>
>>> Short version: if you are sending a short message, you may actually
>>> be
>>> sending many times before your receive completes (it's a parallel
>>> race
>>> condition). If you're trying to simply match a single send and a
>>> single receive, you might want to use MPI_SENDRECV or a pattern
>>> similar
>>> to:
>>>
> >>> MPI_Irecv(...);
> >>> MPI_Isend(...);
> >>> MPI_Waitall(...);
>>>
>>>
>>
>>
>>
>> I was thinking some more about the issue of the sudden slowdown in my
>> application. After reading about the buffered implementation of
>> MPI_Send and MPI_Recv, I was wondering if my performance could be
>> dropping when the buffer fills.
>> Each message is 132 bytes, so the thread design above will allow a lot
>> of sends to occur before the internal buffer is full (the buffer is
>> the default - which is 64k?). This could result in 496 messages being
>> sent before the buffer fills up.
>>
>> What happens when a small message stored in the buffer is sent - does
>> it empty the used buffer immediately? Or is the buffer reclaimed when
>> needed, i.e., when full, or is there some type of garbage collection?
>>
>> I can't figure out what is causing the sudden massive drop in
>> performance. I've used gkrellm to monitor memory and CPU usage and
>> both seem fairly constant, and no paging is done at all. The other odd
>> thing is that it doesn't matter if I allocate 11 processes to one CPU
>> or 11 processes to 3 CPUs - the performance drop happens at about the
>> same point.
>>
>> Is there a decent free tool for monitoring/profiling mpi programs?
>>
>> Cheers
>>
>> -Mike
>>
>>
>>
>>
>>
>>
>>
>>
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/