Brian Barrett writes:
>On Thu, 29 Nov 2001, Martin Knoblauch wrote:
>> does LAM-MPI have the equivalent of the MPI_MSGS_PER_PROC environment
>> variable in IRIX or Unicos? I am trying to debug a problem where an
>> application fails during async send/recv operations. Increasing those variables
>> on IRIX did help the problem. My environment is
>> lam-6.5.4/usysv/linux-2.4.9-ac18.
>I'm afraid I really don't know what problem the Irix MPI_MSGS_PER_PROC is
>trying to solve, so let's try going at this from another angle. What
At least back in the PowerChallenge days, if you threw too many
asynchronous messages at SGI's MPI, it essentially ran out of
buffer memory inside the MPI implementation. Bumping up MPI_MSGS_PER_PROC
and similar environment variables delayed the error message/deadlock/core dump.
I don't know if LAM has such a limitation; if it does, I haven't
seen it.
In general though, you don't want to be sending 10^6 async messages and
_then_ posting recvs... try to process incoming async messages as you're
sending out more - it keeps all the queue lengths down.
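
To make the queue-length point concrete, here is a toy model in plain Python (no MPI involved). It is only a sketch: the function name and the `drain_every` parameter are made up for illustration, and a simple deque stands in for whatever queue of unmatched messages an MPI implementation might keep internally.

```python
from collections import deque


def peak_queue_length(n_messages, drain_every=None):
    """Return the largest backlog seen while 'sending' n_messages.

    drain_every=None models "send everything, then post the recvs".
    drain_every=k models processing incoming messages after every k sends.
    """
    queue = deque()
    peak = 0
    for i in range(n_messages):
        queue.append(i)  # stands in for an async message waiting to be matched
        peak = max(peak, len(queue))
        if drain_every and (i + 1) % drain_every == 0:
            while queue:
                queue.popleft()  # stands in for receiving and processing a message
    while queue:
        queue.popleft()  # final drain once all sends have been issued
    return peak
```

In this model, draining after every 100 sends caps the backlog at 100 entries, while draining only at the end lets it grow to the full message count.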
>asynchronous send/recv operations. How many would you say are outstanding
>at any given time?
When I encountered this problem I wrote a buffering layer for my async
comms. It accumulated small messages into larger bundles and delayed
giving these to MPI until a certain amount of time had passed, or a
certain message length had been exceeded. On old SGIs and on some slow
(100Mbit) networks this was a win. I don't know about newer SGIs.
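
Roughly, a layer like that could look like the sketch below. This is not the original code, just a minimal Python illustration of the idea: the class name, the `send_bundle` callback (which is where the real MPI send would go), and the default thresholds are all assumptions picked for the example.

```python
import time


class MessageBundler:
    """Collect small messages and hand them on as one bundle.

    A bundle is flushed when its total size reaches max_bytes or when its
    oldest message has waited at least max_delay seconds.
    """

    def __init__(self, send_bundle, max_bytes=64 * 1024, max_delay=0.01,
                 clock=time.monotonic):
        self._send_bundle = send_bundle  # hypothetical callback wrapping the real send
        self._max_bytes = max_bytes
        self._max_delay = max_delay
        self._clock = clock
        self._pending = []
        self._pending_bytes = 0
        self._oldest = None

    def send(self, payload):
        """Queue one small message; flush if a threshold has been reached."""
        if not self._pending:
            self._oldest = self._clock()  # start timing from the first queued message
        self._pending.append(payload)
        self._pending_bytes += len(payload)
        self.poll()

    def poll(self):
        """Call regularly so that old bundles go out even when sends are rare."""
        if not self._pending:
            return
        too_big = self._pending_bytes >= self._max_bytes
        too_old = self._clock() - self._oldest >= self._max_delay
        if too_big or too_old:
            self.flush()

    def flush(self):
        """Hand everything queued so far to the transport as a single bundle."""
        if self._pending:
            bundle, self._pending = self._pending, []
            self._pending_bytes = 0
            self._send_bundle(bundle)
```

The receiving side then has to unpack each bundle back into individual messages, and `flush()` should be called before shutting down so nothing is left queued.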
cheers,
robin
--
Dr Robin Humble http://www.cita.utoronto.ca/~rjh/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/