LAM/MPI General User's Mailing List Archives

From: Lei_at_[hidden]
Date: 2004-12-19 23:16:09


Hi,

In parallel matrix multiplication, the C submatrices C1, C2, C3, etc.
are computed from the A and B submatrix pairs (A1, B1), (A2, B2),
(A3, B3), etc. received from other PEs. If I loop over C1, C2, ...
in that order, my MPI_Wait() may spend real time waiting for a pair
(Ai, Bi) to arrive, even though other pairs (Aj, Bj) have already
arrived. So my question is: how do I pick the pairs that have already
arrived and compute with them first, so that my CPUs stay mostly busy
and the cost of communication is partially hidden? Is MPI_Probe()
the right way to go? Do I need to maintain a queue myself to manage
the skipped pairs (since they are still in transit) so I can
come back to them later?
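
To make the bookkeeping concrete, here is a minimal sketch of one possible
arrangement, not a definitive answer. It assumes all receives are posted up
front with MPI_Irecv() and that completions are then collected with
MPI_Waitany(), which reports the index of whichever request finished. The
request layout (request 2*i receives Ai, request 2*i+1 receives Bi) and the
names PairTracker, tracker_mark and schedule_pairs are hypothetical, made up
for illustration. The MPI calls appear only in comments so the block compiles
without an MPI installation; the array `completed` stands in for the sequence
of indices MPI_Waitany() would return.

```c
#include <assert.h>

/* Assumed layout: request 2*i receives A_i, request 2*i+1 receives B_i. */
#define MAX_PAIRS 64

typedef struct {
    int halves_arrived[MAX_PAIRS]; /* 0, 1 or 2 submatrices received so far */
} PairTracker;

void tracker_init(PairTracker *t) {
    for (int i = 0; i < MAX_PAIRS; ++i) {
        t->halves_arrived[i] = 0;
    }
}

/* Record that request `req_index` has completed.
 * Returns the pair index if both A_i and B_i are now present, else -1. */
int tracker_mark(PairTracker *t, int req_index) {
    assert(req_index >= 0 && req_index / 2 < MAX_PAIRS);
    int pair = req_index / 2;
    t->halves_arrived[pair] += 1;
    return (t->halves_arrived[pair] == 2) ? pair : -1;
}

/* Walk the completion sequence and record the order in which pairs become
 * computable. Returns the number of pairs written to compute_order. */
int schedule_pairs(const int *completed, int n_requests, int *compute_order) {
    PairTracker t;
    int n_ready = 0;
    tracker_init(&t);
    for (int k = 0; k < n_requests; ++k) {
        /* In a real program the index would come from something like:
         *   MPI_Waitany(n_requests, requests, &idx, &status);          */
        int pair = tracker_mark(&t, completed[k]);
        if (pair >= 0) {
            /* Both halves are here: this is where C_pair would be computed. */
            compute_order[n_ready++] = pair;
        }
    }
    return n_ready;
}
```

With this arrangement no separate queue of skipped pairs is kept: a pair that
is still in transit never shows up as complete, and each call hands back
whichever request finished next. Whether that actually hides communication
cost depends on how much progress the MPI implementation makes in the
background, which this sketch does not address.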

Any suggestions are highly appreciated!

Thanks,

-Lei