LAM/MPI General User's Mailing List Archives

From: Michael Lees (mhl_at_[hidden])
Date: 2005-08-17 06:45:04


Hi all,

Okay, it seems the design I have at present is unsafe - unsafe to the
point where it actually breaks.

I don't know if it's possible to do what I want with MPI. To me it seems
like the most basic form of asynchronous message passing.

There is an input queue and an output queue, and there need to be some
number of threads which service these queues: making MPI_Send calls for
messages in the output queue, and placing messages into the input queue
every time an MPI_Recv completes.

With MPI_THREAD_MULTIPLE (as far as I can tell) the solution would be
simple:

InThread {
    MPI_Recv(msg)
    InQueue.push(msg)
}

OutThread {
    cond_wait(cv)
    OutQueue.pop(msg)
    MPI_Send(msg)
}

Then every insertion into the out queue signals the condition variable
cv. Because the MPI_Recv in the in-thread is (or will eventually be)
posted, the sends will never deadlock. Now obviously I can't do the
above, but I previously thought the single-thread solution I had was the
best approximation. As I've discovered, though, the looping sends cause
a problem - i.e., deadlock.
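For concreteness, here is roughly what I mean by that two-thread version
in C, assuming MPI_THREAD_MULTIPLE were actually available. It's only a
sketch, not my real code - the queue capacity, message size, tag, and the
'peer' rank are made-up placeholders:

/* Sketch only: a two-thread producer/consumer design that would need
 * MPI_THREAD_MULTIPLE.  QCAP, MSG_SIZE, TAG, and peer are placeholders,
 * not values from any real application. */
#include <mpi.h>
#include <pthread.h>

#define MSG_SIZE 132
#define TAG      0
#define QCAP     64

typedef struct { char buf[MSG_SIZE]; } msg_t;

/* Trivial bounded queue protected by a mutex and condition variable. */
typedef struct {
    msg_t items[QCAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
} queue_t;

static queue_t in_q  = { .lock = PTHREAD_MUTEX_INITIALIZER,
                         .nonempty = PTHREAD_COND_INITIALIZER };
static queue_t out_q = { .lock = PTHREAD_MUTEX_INITIALIZER,
                         .nonempty = PTHREAD_COND_INITIALIZER };
static int peer;                       /* rank we exchange messages with */

static void queue_push(queue_t *q, const msg_t *m) {
    pthread_mutex_lock(&q->lock);
    q->items[q->tail] = *m;            /* sketch: ignores overflow */
    q->tail = (q->tail + 1) % QCAP;
    q->count++;
    pthread_cond_signal(&q->nonempty); /* wake the thread waiting in pop */
    pthread_mutex_unlock(&q->lock);
}

static void queue_pop(queue_t *q, msg_t *m) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    *m = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
}

/* InThread: every completed MPI_Recv goes straight into the input queue. */
static void *in_thread(void *arg) {
    msg_t m;
    for (;;) {
        MPI_Recv(m.buf, MSG_SIZE, MPI_CHAR, peer, TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        queue_push(&in_q, &m);
    }
    return NULL;
}

/* OutThread: sleeps on the condition variable until the output queue has
 * something, then sends it. */
static void *out_thread(void *arg) {
    msg_t m;
    for (;;) {
        queue_pop(&out_q, &m);         /* cond_wait(cv) happens in here */
        MPI_Send(m.buf, MSG_SIZE, MPI_CHAR, peer, TAG, MPI_COMM_WORLD);
    }
    return NULL;
}

int main(int argc, char **argv) {
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)   /* the whole design relies on this */
        MPI_Abort(MPI_COMM_WORLD, 1);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;                      /* two-process example only */

    pthread_t tin, tout;
    pthread_create(&tin,  NULL, in_thread,  NULL);
    pthread_create(&tout, NULL, out_thread, NULL);
    pthread_join(tin, NULL);              /* threads loop forever in this sketch */
    pthread_join(tout, NULL);
    MPI_Finalize();
    return 0;
}

(The main() is just enough to run it with two processes; peer = 1 - rank
is obviously not how I'd do it for real.)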

So I've been trying to think of a safe solution to the problem above. I
came up with:

InOutThread {
    MPI_Irecv
    if (!outQ.empty())
        MPI_Isend

    MPI_Waitany(index)

    if (index == recv)
        inQ.push()
}

This is incomplete - I think I'd have to cancel whichever of the Irecv
and Isend didn't complete at the beginning of each loop? Would that be
inefficient? The other option would be to check which of the operations
completed and post a new one of that kind.
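Something like this is what I have in mind for that second option (again
only a sketch - it reuses the placeholder msg_t/queue_t, peer, TAG, and
MSG_SIZE from the sketch above, and try_pop is a made-up non-blocking pop):

/* Sketch of the "repost whichever completed" variant, single threaded.
 * It reuses the placeholder types above; try_pop is a hypothetical
 * non-blocking pop.  It does NOT solve the problem below: if the out
 * queue is empty, we still sit in MPI_Waitany until a receive arrives. */
#include <string.h>   /* memcpy */

enum { RECV = 0, SEND = 1 };

/* Non-blocking pop: returns 1 and fills *m if something was queued, else 0. */
static int try_pop(queue_t *q, msg_t *m) {
    int got = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        *m = q->items[q->head];
        q->head = (q->head + 1) % QCAP;
        q->count--;
        got = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return got;
}

static void in_out_loop(void) {
    MPI_Request req[2] = { MPI_REQUEST_NULL, MPI_REQUEST_NULL };
    char rbuf[MSG_SIZE], sbuf[MSG_SIZE];
    msg_t m;
    int index;

    /* Keep exactly one receive posted at all times. */
    MPI_Irecv(rbuf, MSG_SIZE, MPI_CHAR, peer, TAG, MPI_COMM_WORLD, &req[RECV]);

    for (;;) {
        /* Post a send only if none is outstanding and the out queue has data. */
        if (req[SEND] == MPI_REQUEST_NULL && try_pop(&out_q, &m)) {
            memcpy(sbuf, m.buf, MSG_SIZE);
            MPI_Isend(sbuf, MSG_SIZE, MPI_CHAR, peer, TAG,
                      MPI_COMM_WORLD, &req[SEND]);
        }

        /* Null requests are ignored, so this waits on whatever is active. */
        MPI_Waitany(2, req, &index, MPI_STATUS_IGNORE);

        if (index == RECV) {
            /* A message arrived: queue it and immediately repost the receive. */
            memcpy(m.buf, rbuf, MSG_SIZE);
            queue_push(&in_q, &m);
            MPI_Irecv(rbuf, MSG_SIZE, MPI_CHAR, peer, TAG,
                      MPI_COMM_WORLD, &req[RECV]);
        }
        /* If index == SEND, MPI_Waitany has already reset req[SEND] to
         * MPI_REQUEST_NULL, so the next iteration may post another send. */
    }
}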

The other problem with this is that if the out queue is empty and no recv
occurs, we're stuck in the MPI_Waitany call. Is it safe to call
MPI_Cancel from another thread when something is inserted into the out
queue?

Sorry for the general questions - I know they're not LAM-specific, but
I'm quite stuck.

Thanks again

--Mike

David Cronk wrote:
>
> Michael Lees wrote:
>
>>Hello again,
>>
>>What is the default behaviour when two MPI processes try to perform an
>>MPI_Send at the same time?
>>
>>Is this type of behaviour allowed in MPI, and if so does it require the use
>>of non-blocking sends (or some type of buffer)?
>>
>>I've managed to achieve deadlock with two MPI processes stuck in
>>MPI_Send calls to one another. After our previous discussions I would
>>have thought the sends would have completed by placing the messages into
>>a buffer before sending? However, it seems both MPI_Sends are stuck in
>>the spin lock in lam_ssi_rpi_usysv_low_fastsend()?
>
>
> MPI can ONLY place the messages in a buffer if there is enough buffer
> space available. If there is insufficient buffer space, it will block.
> The only other choice would be to fault, and I don't think anyone
> would suggest that would have been a good choice by the MPI forum.
>
> The bottom line is, any code that relies on ANY buffer space is unsafe.
> You cannot assume any buffer space will be available. A way I
> sometimes present this when I teach MPI classes is: If there is a cycle
> of blocking sends (remembering that a send/wait pair qualifies as a
> blocking send), the program is unsafe.
>
> Hope this helps.
>
> Dave.
>
>
>>Thanks for all your help
>>
>>Mike
>>
>>
>>
>>Michael Lees wrote:
>>
>>
>>>Jeff Squyres wrote:
>>>
>>>
>>>
>>>>On Aug 12, 2005, at 11:22 AM, Michael Lees wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>>>I have a thread for performing asynchronous sending and receiving,
>>>>>>>something along the lines of...
>>>>>>>
>>>>>>>while (running) {
>>>>>>>    MPI_Irecv(&request)
>>>>>>>    while (recvflag == 0) {
>>>>>>>        MPI_Test(request, recvflag)
>>>>>>>        MPI_Send()
>>>>>>>        yield()
>>>>>>>    }
>>>>>>>    yield()
>>>>>>>}
>>>>>>
>>>>>>
>>>>>>Wow, that's a lot of sending. :-)
>>>>>
>>>>>Is it, though? As far as I understand it, the blocking send will wait
>>>>>until a matching receive is posted? Am I wrong? So there'll only be one
>>>>>send per receive?
>>>>
>>>>
>>>>No, not necessarily. Check out the sidebar entitled "To block or not
>>>>to block" in this column for an explanation:
>>>>
>>>> http://cw.squyres.com/columns/2004-02-CW-MPI-Mechanic.pdf
>>>>
>>>>Short version: if you are sending a short message, you may actually be
>>>>sending many times before your receive completes (it's a parallel race
>>>>condition). If you're trying to simply match a single send and a
>>>>single receive, you might want to use MPI_SENDRECV or a pattern similar
>>>>to:
>>>>
>>>>MPI_Irecv(...)
>>>>MPI_Isend(...);
>>>>MPI_Waitall(...);
>>>>
>>>>
>>>
>>>
>>>
>>>I was thinking some more about the issue of the sudden slowdown in my
>>>application. After reading about the buffered implementation of MPI_Send
>>>and MPI_Recv, I was wondering if my performance could be dropping when
>>>the buffer fills?
>>>Each message is 132 bytes, so the thread design above will allow lots
>>>of sends to occur before the internal buffer is full (the buffer is the
>>>default - which is 64k?). This could result in 496 messages being sent
>>>before the buffer fills up.
>>>
>>>What happens when a small message stored in the buffer is sent - does it
>>>empty the used buffer immediately? Or is the buffer reclaimed when
>>>needed, i.e., when full, or is there some type of garbage collection?
>>>
>>>I can't figure out what is causing the sudden massive drop in
>>>performance. I've used gkrellm to monitor memory and CPU usage and both
>>>seem fairly constant, and no paging is done at all. The other odd thing
>>>is that it doesn't matter if I allocate 11 processes to one CPU or 11
>>>processes to 3 CPUs - the performance drop happens at about the same point.
>>>
>>>Is there a decent free tool for monitoring/profiling MPI programs?
>>>
>>>Cheers
>>>
>>>-Mike
>>>
>>
>>
>>
>
>
