LAM/MPI General User's Mailing List Archives

From: Robin Humble (rjh_at_[hidden])
Date: 2003-11-13 12:32:32


On Thu, Nov 13, 2003 at 05:32:24PM +0100, jess michelsen wrote:
>and 850Mbit/sec bandwidth, full duplex, (both NetPipe and my own Fortran
>application - NetPipe TCP numbers are exactly the same).

sounds good!

>What might be the difference between the 4.4.12 and 4.4.12-k1 e1000
>versions?

I sent you a patch off-list, but I think the summary is that -k1 is an
intermediate/unofficial NAPI patch.

>Now that I've got the right communication performance, I've tried to
>overlap the communication with some computation. The computation is
>- like our CFD applications - memory-bound (moving a couple of large
>arrays in and out of cache). Is this the reason that overlapping
>communication and computation shows only a marginal reduction (up to
>20%) in the communication time (sum of the time-differences of the 3
>MPI calls)?

Hmmm - I'd have thought a 20% improvement was pretty good?

Asynchronous messages probably use more CPU and more memcpys than the
more 'direct' MPI_Send/MPI_Recv path, so if you are already memory-bound
then yeah, maybe that's the problem.
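
For reference, here's a minimal sketch of the overlap pattern I'm assuming
you're after (C bindings just for illustration since your code is Fortran;
the ring exchange, sizes and array names are all made up): post the
nonblocking calls first, do the memory-bound work on arrays the messages
don't touch, then wait.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs, next, prev, i;
    const int N = 1 << 20;                 /* 8 MB of doubles, well above 50KB */
    double *sendbuf, *recvbuf, *work;
    MPI_Request req[2];
    MPI_Status stats[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    next = (rank + 1) % nprocs;            /* simple ring exchange, illustration only */
    prev = (rank + nprocs - 1) % nprocs;

    sendbuf = (double *) malloc(N * sizeof(double));
    recvbuf = (double *) malloc(N * sizeof(double));
    work    = (double *) malloc(N * sizeof(double));
    for (i = 0; i < N; i++) { sendbuf[i] = rank; work[i] = i; }

    /* post the exchange first ... */
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, prev, 0, MPI_COMM_WORLD, &req[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, next, 0, MPI_COMM_WORLD, &req[1]);

    /* ... then do the memory-bound work on arrays the messages don't touch ... */
    for (i = 0; i < N; i++)
        work[i] = 2.0 * work[i];

    /* ... and only wait once the computation is done */
    MPI_Waitall(2, req, stats);

    free(sendbuf); free(recvbuf); free(work);
    MPI_Finalize();
    return 0;
}

One caveat: with large messages and no progress thread, much of the actual
data movement may not happen until the Waitall anyway, which would also
limit how much overlap you can see.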

Maybe the Llamas can suggest a better send/recv mode for your situation.

Using large messages (>>50KB) is usually a big win, as they are
bandwidth- rather than latency-dominated.
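
E.g. at ~850 Mbit/sec (~106 MB/sec) a 50KB message spends roughly
50000/106e6 ~ 470 usec on the wire, versus a TCP latency of maybe
50-100 usec, whereas a 1KB message needs only ~10 usec of wire time
and the latency dominates completely. (Rough numbers only - I'm
guessing at your latency.)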

cheers,
robin