Greetings,
We have been studying the LAM TCP RPI state machine, and in our testing we have
uncovered some LAM behavior that we wanted to discuss. Say we have two nodes,
one process each. When sending messages from n1 to n0, MPI semantics require
the messages to be non-overtaking if they have the same tag (on the same
communicator). In the non-blocking case, that order is defined by the execution
order of the initiating calls. Bearing this in mind, say n0 and n1 execute the
following pseudocode:
The sizes of bufA, bufB, bufC, bufD, and bufE are 200, 800000, 300, 6600, and 100, respectively.
___
n0
___
from = 1
tag = 0
Irecv(bufA, from, tag)
Irecv(bufB, from, tag)
Irecv(bufC, from, tag)
Irecv(bufD, from, tag)
Irecv(bufE, from, tag)
Wait(C)
compute(C+B)
WaitAll()
___
n1
___
to = 0
tag = 0
Isend(bufA, to, tag)
Isend(bufB, to, tag)
Isend(bufC, to, tag)
Isend(bufD, to, tag)
Isend(bufE, to, tag)
compute()
WaitAll()
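For concreteness, here is a minimal runnable C reconstruction of the above (our
sketch, not the exact test code: the buffer sizes are assumed to be byte
counts, the buffers are transferred as MPI_BYTE, and compute() is a
placeholder):

#include <mpi.h>
#include <stdlib.h>

static void compute(void) { /* placeholder for real work */ }

int main(int argc, char **argv)
{
    /* sizes of bufA..bufE, assumed to be byte counts */
    const int sizes[5] = { 200, 800000, 300, 6600, 100 };
    char *buf[5];
    MPI_Request req[5];
    int rank, tag = 0, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < 5; i++)
        buf[i] = calloc(1, sizes[i]);

    if (rank == 0) {            /* n0: posts all receives in order */
        for (i = 0; i < 5; i++)
            MPI_Irecv(buf[i], sizes[i], MPI_BYTE, 1, tag,
                      MPI_COMM_WORLD, &req[i]);
        MPI_Wait(&req[2], MPI_STATUS_IGNORE);       /* Wait(C) */
        /* compute(C+B) would run here -- but has B completed? */
        MPI_Waitall(5, req, MPI_STATUSES_IGNORE);   /* WaitAll() */
    } else if (rank == 1) {     /* n1: posts all sends in order */
        for (i = 0; i < 5; i++)
            MPI_Isend(buf[i], sizes[i], MPI_BYTE, 0, tag,
                      MPI_COMM_WORLD, &req[i]);
        compute();
        MPI_Waitall(5, req, MPI_STATUSES_IGNORE);   /* WaitAll() */
    }

    for (i = 0; i < 5; i++)
        free(buf[i]);
    MPI_Finalize();
    return 0;
}

(Run with two processes, e.g. mpirun -np 2 ./a.out.)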
------------------------------------
We are seeing cases where the Wait(C) call in n0 returns while the receive
request associated with bufB has not yet completed. This means our
compute(C+B) statement operates on an incomplete bufB. We understand that B,
being large, must go through the rendezvous protocol, but regardless, MPI
semantics make no special case for large messages, and we are wondering
whether LAM is making one here...
Obviously, we could also have done a Wait(B), but we had thought that
Wait(C) implied Wait(B) due to MPI's non-blocking semantics.
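In terms of the reconstruction above, that explicit-wait workaround would look
like this (waiting on the bufB request, req[1], before touching bufB):

MPI_Wait(&req[1], MPI_STATUS_IGNORE);   /* Wait(B) */
MPI_Wait(&req[2], MPI_STATUS_IGNORE);   /* Wait(C) */
/* compute(C+B) is now safe regardless of completion order */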
Are we seeing anomalous behavior here?
Thanks,
brad