LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-06-22 22:53:30


I seem to recall posting the details of this kind of stuff to the list
before -- have you searched the list archives?

Without diving into the details (they are sordid, twisted, and
complicated), note that MPI_Send is *allowed* to block, but it doesn't
always *have to block*. It is always safest to assume that MPI_Send
*will* block. Indeed, Example 3.9 in MPI-1 (p33) gives a good example of
an unsafe program and cites the fact that if you replace all standard
sends with synchronous sends, a "safe" program will not deadlock.

You may want to look for situations like that in your code. If you have
similar situations, you may wish to consider using non-blocking MPI
communication.

On Tue, 22 Jun 2004, Thomas Lavergne wrote:

> Dear all,
> I am running into an odd behaviour of my code which, from time to time, hangs
> on communication. It does so only when several processes are launched by
> mpirun on each node (we have bi-procs here, so I try to use them with cpu=2
> in the hosts_list). In order to reproduce/locate/understand this behaviour, I
> would like to access info concerning the LAM implementation of MPI_Send
> blocking communication scheme. In particular, I would like to know how LAM
> reacts when many MPI_Send calls (with possibly large messages) are issued
> without any MPI_Recv to gather them. For example, are there several reactions,
> depending on the amount and size of previously buffered messages?
> I have seen the FAQ and documentation pages but could only find that: "The
> LAM Team [...] will more than likely only be able to direct you to relevant
> parts of the LAM source code." So... where should I start, please?
>
> Many thanks
> Thomas
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/