
LAM/MPI General User's Mailing List Archives


From: jess michelsen (jam_at_[hidden])
Date: 2004-01-06 12:03:41


Hi Everyone!

I'm solving a large CFD problem on different numbers of processors. The
CFD problem is almost perfectly load-balanced, with 84 blocks of equal
size (and workload). Timings for CPU counts up to 42 are reasonable,
although one begins to see the effect of communication time.

However, when I increase the number of CPUs from 42 to 84, the job
stalls completely. Time increases from around 60 seconds to almost
13,000 seconds (!). A closer look at the times for individual parts of
the job reveals that a limited number of calls (approximately 120) to
MPI_ALLGATHERV is responsible for the entire growth in time consumption.
I double-checked this conclusion by leaving out these calls (this
changes the computed results slightly), and the time was again around
60 seconds.

Each call to MPI_ALLGATHERV gathers about 43 KB of double-precision
data, an equal amount from each processor.
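
For concreteness, a minimal C sketch of the call pattern (the buffer
sizes, variable names, and MPI_Wtime bracketing are illustrative
assumptions, not taken from the actual solver; 64 doubles per rank on
84 ranks gives roughly the 43 KB total):

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Illustrative sizes: 64 doubles per rank on 84 ranks
           gives a total of ~43 KB gathered on every rank. */
        int chunk = 64;
        int total = chunk * nprocs;

        double *sendbuf = malloc(chunk * sizeof *sendbuf);
        double *recvbuf = malloc(total * sizeof *recvbuf);
        int *counts = malloc(nprocs * sizeof *counts);
        int *displs = malloc(nprocs * sizeof *displs);

        for (int i = 0; i < nprocs; i++) {
            counts[i] = chunk;      /* identical contribution per rank */
            displs[i] = i * chunk;  /* contiguous placement in recvbuf */
        }
        for (int i = 0; i < chunk; i++)
            sendbuf[i] = (double)rank;   /* dummy payload */

        /* The job makes roughly 120 such calls per run; each one
           is bracketed here the way the per-section timings were. */
        double t0 = MPI_Wtime();
        MPI_Allgatherv(sendbuf, chunk, MPI_DOUBLE,
                       recvbuf, counts, displs, MPI_DOUBLE,
                       MPI_COMM_WORLD);
        double dt = MPI_Wtime() - t0;

        if (rank == 0)
            printf("allgatherv took %f s\n", dt);

        free(sendbuf); free(recvbuf); free(counts); free(displs);
        MPI_Finalize();
        return 0;
    }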

Since the gather operation is obviously needed here, could anything be
done to alleviate this situation? Has anybody witnessed anything like
this?

Best regards, Jess Michelsen