LAM/MPI General User's Mailing List Archives

From: Brian W. Barrett (brbarret_at_[hidden])
Date: 2003-06-17 14:28:14


On Sun, 15 Jun 2003, Peter Skaarup wrote:

> At 65536/65537 there is a change in performance. A quick search through
> the source code (version 6.5.9) revealed that there is a line
> #define LAM_TCPSHORTMSGLEN 65536 and that define is used in
> rpi/tcp/rpi_tcp.c, where it is used to select the code for short message
> '<= 65536' or long message '> 65536'.
>
> My question is, why do you define this as the border between short
> or long messages? It seems that TCP fragments the MPI message, but
> into bits that fit the MTU size, which is about 1500 bytes on most
> Ethernets. And with a message of 65536 bytes (a short one by
> definition) 65560 bytes will be passed to the TCP layer of the
> communication because of the 24-byte MPI header.

The change in performance comes from MPI_Send using two different
protocols: a "short" protocol and a "long" protocol. As you discovered,
the cross-over point is 64K (65536 bytes), which is about the largest
TCP socket buffer size you can count on being able to set on every
platform on which LAM runs.
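
As a rough illustration only (this is not the code in rpi/tcp/rpi_tcp.c),
the selection amounts to something like the sketch below. The enum and
function names are made up for this example; only LAM_TCPSHORTMSGLEN and
its value are taken from the 6.5.9 source quoted above.

```c
#include <stddef.h>

/* Crossover value quoted in this thread from the LAM 6.5.9 source. */
#define LAM_TCPSHORTMSGLEN 65536

/* Hypothetical names for illustration; not LAM's internal identifiers. */
enum send_protocol { PROTO_SHORT, PROTO_LONG };

/*
 * Pick a protocol from the payload length in bytes.
 * Payloads up to and including the threshold take the short path;
 * anything larger takes the long path.
 */
enum send_protocol choose_protocol(size_t payload_len)
{
    return (payload_len <= LAM_TCPSHORTMSGLEN) ? PROTO_SHORT : PROTO_LONG;
}
```

With this sketch, a 65536-byte message still takes the short path and a
65537-byte message is the first to take the long path, which matches
where you saw the performance change.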

The long protocol does not start sending the actual data until a matching
receive has been posted on the other side. While there are a bunch of
reasons for this behavior, one of the more compelling reasons is the
buffer behavior of MPI. Without the long protocol, LAM could be put in a
situation where it has to receive a large message off the TCP connection
with no user buffer to put it in - resulting in the MPI implementation
malloc()ing a rather large amount of memory (and later memcpy()ing that
message into the user buffer). This really isn't a great idea and can
kill performance. Hence, the long protocol.
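
To make the buffering argument concrete, here is a toy model of the
receiving side. It is a sketch under stated assumptions, not LAM's
implementation: the struct and all function names are hypothetical, and
the "wire" pointer simply stands in for data read off the TCP connection.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical receiver-side state for one incoming message. */
struct incoming {
    size_t len;          /* payload length in bytes */
    void  *temp;         /* library-owned staging buffer, NULL if unused */
    size_t bytes_staged; /* bytes allocated for staging */
};

/*
 * Data arrives before any receive is posted, so the library has to
 * stage the whole payload. Returns 0 on success, -1 if malloc() fails.
 */
int eager_arrive_unexpected(struct incoming *m, const void *wire, size_t len)
{
    m->temp = malloc(len);
    if (m->temp == NULL)
        return -1;
    memcpy(m->temp, wire, len);        /* copy 1: connection -> staging */
    m->len = len;
    m->bytes_staged = len;
    return 0;
}

/* The receive is posted later: copy out of staging and release it. */
void eager_complete(struct incoming *m, void *user_buf)
{
    memcpy(user_buf, m->temp, m->len); /* copy 2: staging -> user buffer */
    free(m->temp);
    m->temp = NULL;
}

/* Long-protocol style: only the envelope arrives, nothing is staged. */
void rendezvous_arrive_envelope(struct incoming *m, size_t len)
{
    m->len = len;
    m->temp = NULL;
    m->bytes_staged = 0;
}

/* Once the receive is posted, the payload lands in the user buffer. */
void rendezvous_complete(struct incoming *m, void *user_buf, const void *wire)
{
    memcpy(user_buf, wire, m->len);    /* single copy: connection -> user */
}
```

In this model the first path costs a malloc() of the full message size
plus two copies, while the second path needs no staging memory and only
one copy. That difference grows with message size, which is the
motivation for switching protocols at a threshold.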

Hope this helps,

Brian

-- 
  Brian Barrett
  LAM/MPI developer and all around nice guy
  Have a LAM/MPI day: http://www.lam-mpi.org/