On Thu, Nov 13, 2003 at 05:32:24PM +0100, jess michelsen wrote:
>and 850Mbit/sec bandwidth, full duplex, (both NetPipe and my own Fortran
>application - NetPipe TCP numbers are exactly the same).
sounds good!
>What might be the difference between the 4.4.12 and 4.4.12-k1 e1000
>versions?
I sent you a patch off-list, but I think the summary is that -k1 is an
intermediate/unofficial NAPI patch.
>Now, that I got the right performance of the communication, I've tried
>to overlap the communication by some computations. The computations are
>- like our CFD applications - memory-bound (moving a couple of large
>arrays in and out of cache). Is this the reason that the effect of
>overlapping communications and computations shows only a marginal reduction
>(up to 20%) in the communication time (sum of the time differences of the 3
>MPI calls)?
Hmmm - I'd have thought a 20% improvement was pretty good?
Asynchronous messages probably use more CPU and more memcpys than the
more 'direct' MPI_Send/Recv, so if you are already memory-bound then yeah,
maybe that's the problem.
Maybe the Llamas can suggest a better send/recv mode for your situation.
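For reference, here's a rough sketch of the overlap pattern I mean (in C
rather than your Fortran, and with made-up buffer sizes and a dummy update
loop, so take it as illustration only, not your code): post the nonblocking
calls, do the interior work that touches neither buffer, then wait.

  #include <mpi.h>
  #include <stdlib.h>

  #define N (1 << 20)   /* interior array: 1M doubles, ~8 MB (made up)  */
  #define H (1 << 16)   /* halo: 64K doubles = 512 KB, well above 50KB  */

  int main(int argc, char **argv)
  {
      int rank, size, i, left, right;
      double *a, *halo, *sendbuf;
      MPI_Request req[2];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      a       = malloc(N * sizeof(double));   /* interior data       */
      halo    = malloc(H * sizeof(double));   /* boundary coming in  */
      sendbuf = malloc(H * sizeof(double));   /* boundary going out  */
      for (i = 0; i < N; i++) a[i] = rank;
      for (i = 0; i < H; i++) sendbuf[i] = rank;

      right = (rank + 1) % size;
      left  = (rank + size - 1) % size;

      /* 1. start the (large, bandwidth-dominated) boundary exchange */
      MPI_Irecv(halo,    H, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
      MPI_Isend(sendbuf, H, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[1]);

      /* 2. interior update that touches neither halo nor sendbuf; if this
         loop already saturates the memory bus, the extra copies inside
         the MPI library compete with it and the overlap buys little */
      for (i = 0; i < N; i++)
          a[i] = 0.5 * (a[i] + 1.0);

      /* 3. pay only for whatever communication is still outstanding */
      MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

      free(a); free(halo); free(sendbuf);
      MPI_Finalize();
      return 0;
  }

(Caveat: whether the transfer really makes progress during step 2 depends
on the MPI implementation - a single-threaded TCP implementation may only
move data while you are inside an MPI call.)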
Using large messages (>>50KB) is usually a big win, since they are
bandwidth-dominated rather than latency-dominated.
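To put rough numbers on that (using your measured ~850Mbit/s and guessing
at something like 50us of per-message overhead - not a figure you
reported): a 50KB message spends roughly 0.5ms on the wire, so the fixed
per-message cost is under 10% of the total, whereas for a 1KB message it
would be the bulk of the time.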
cheers,
robin