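Wait(C) does not necessarily imply Wait(B). The non-overtaking rule
governs the order in which messages are matched, not the order in which
requests complete, so MPI says that you have to wait for both before
you can read those buffers. It would be concerning if you did Wait(C)
followed by Wait(B) and still got the wrong results.
Can you try that?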
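To illustrate why a small message can complete ahead of a large one
that was matched before it, here is a toy model in C. It is only a
sketch, not LAM's actual TCP RPI code. It treats the sizes from your
pseudocode as bytes and assumes a hypothetical eager limit of 65536
bytes and a made-up cost of 10 steps for the rendezvous handshake.
match_step, completion_step, print_schedule, EAGER_LIMIT and
RNDV_EXTRA_STEPS are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* ASSUMPTION: hypothetical eager/rendezvous cutoff in bytes. */
#define EAGER_LIMIT 65536u

/* ASSUMPTION: made-up extra cost of the rendezvous handshake
 * (ack plus separate body transfer), in abstract "steps". */
#define RNDV_EXTRA_STEPS 10

#define NMSG 5

/* Envelopes arrive in send order, so message i is matched at step i.
 * This is the part the non-overtaking rule constrains. */
int match_step(int index)
{
    return index;
}

/* An eager message carries its data with the envelope and completes
 * when it is matched; a rendezvous message completes only after the
 * extra handshake steps. */
int completion_step(int index, size_t size)
{
    int step = match_step(index);
    if (size > EAGER_LIMIT) {
        step += RNDV_EXTRA_STEPS;
    }
    return step;
}

/* Print the match and completion step for the five buffers. */
void print_schedule(void)
{
    const char *name[NMSG] = {"A", "B", "C", "D", "E"};
    const size_t size[NMSG] = {200, 800000, 300, 6600, 100};

    for (int i = 0; i < NMSG; i++) {
        int matched = match_step(i);
        int done = completion_step(i, size[i]);
        assert(matched <= done); /* completion never precedes matching */
        printf("buf%s: matched at step %d, complete at step %d\n",
               name[i], matched, done);
    }
}
```

Under these assumptions, calling print_schedule() shows bufC complete
at step 2 while bufB is not complete until step 11, even though bufB
was matched first. That is why both requests have to be waited on
before compute(C+B) reads the buffers.
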
On Dec 9, 2004, at 6:55 AM, Brad Penoff wrote:
> Greetings,
>
> We have been studying the LAM TCP RPI state machine and in our testing
> we have uncovered some LAM behavior that we wanted to discuss. Say we
> have two nodes, one process each. When sending messages from n0 to
> n1, according to MPI semantics, the messages must be non-overtaking if
> they have the same tag. For non-blocking calls, order is defined
> by the execution order of the initiating calls. Bearing this in mind,
> say n0 and n1 executed the following pseudocode:
>
>
> The sizes of bufA, bufB, bufC, bufD, and bufE are 200, 800000, 300,
> 6600, and 100, respectively.
> ___
> n0
> ___
> from = 1;
> tag = 0;
> Irecv(bufA, from, tag)
> Irecv(bufB, from, tag)
> Irecv(bufC, from, tag)
> Irecv(bufD, from, tag)
> Irecv(bufE, from, tag)
> Wait(C)
> compute(C+B)
> WaitAll()
>
> ___
> n1
> ___
> to = 0
> tag = 0
> Isend(bufA, to, tag)
> Isend(bufB, to, tag)
> Isend(bufC, to, tag)
> Isend(bufD, to, tag)
> Isend(bufE, to, tag)
> compute()
> WaitAll()
>
>
> ------------------------------------
>
> We are seeing cases where the Wait(C) call returns (in n0) while the
> receive request associated with bufB has not yet completed. This
> means our compute(C+B) statement reads incomplete data. We understand
> that B must rendezvous, but regardless, MPI semantics don't make
> special cases for large messages, and we are wondering if LAM is
> making one here...
>
> Obviously, we could have also done a Wait(B), but we had thought
> that Wait(C) implied Wait(B) due to MPI's non-blocking semantics.
>
> Are we seeing anomalous behavior here?
>
> Thanks,
> brad
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/