LAM/MPI General User's Mailing List Archives

From: Ross Torkington (ELA01RMT_at_[hidden])
Date: 2004-04-16 13:11:51


Thank you Bogdan for your reply, and my apologies for not explaining my
situation clearly. This is largely due to my being a novice in this area.

I have four nodes connected by a 10BASE2 network via BNC cable. I am
timing how long it takes for MPI_Send to return when sending messages
between 0.1KB and 200KB in size. Initially the results are as expected, with the
time required increasing gradually with message size. However, when the
message size reaches 90KB (sometimes 100KB) there is a significant jump of
about 30ms in the send time. This is then followed by a 29ms jump at 130KB
and a 25ms jump at 160KB, producing a staircase on my graph. It is these
jumps I don't understand.
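
For concreteness, a minimal sketch of the kind of measurement I mean might
look like the following (the sizes, the single send per size and the use of
MPI_Wtime are illustrative assumptions, not my exact code):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, kb;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(200 * 1024);                  /* largest message tested */

    for (kb = 10; kb <= 200; kb += 10) {
        int bytes = kb * 1024;

        if (rank == 0) {
            /* time only how long MPI_Send() takes to return */
            double t0 = MPI_Wtime();
            MPI_Send(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            printf("%3d KB: %.3f ms\n", kb, (MPI_Wtime() - t0) * 1e3);
        } else if (rank == 1) {
            MPI_Status st;
            MPI_Recv(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &st);
        }
    }

    free(buf);
    MPI_Finalize();
    return 0;
}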

My guess is that the delays occur when a communication buffer needs to be
refilled, but as the default buffer size is 64KB, why is the first jump
nearer 100KB, and why do the subsequent jumps come every 30KB or so?

If it helps, the four computers I am using are very low spec - 100MHz
machines with 32 to 96MB of RAM running Red Hat Linux.

Thanks very much for your help!
Ross

On Thu, 1 Apr 2004, Ross Torkington wrote:

> I've been timing how long it takes to send various sized messages

Send over what? Fast Ethernet, Gigabit Ethernet, Myrinet (GM or IP
over GM), Infiniband, shared memory - these are all supported by LAM
(and sorry if I forgot some) and they have quite different communication
parameters.

> and found that the cost per kb begins high for short messages, drops
> sharply

This is quite counterintuitive, but without a good definition of
"short messages" it's still unclear. Latency tests usually start with
an MPI payload of zero bytes (which means a message that includes the
MPI envelope but no data). Have you included this in your tests? If
you are talking about kilobyte-sized messages and you use Ethernet,
some effect of going over the MTU of the Ethernet interface might be
visible, as the TCP stream has to be chunked at the transmitter and
reassembled at the receiver.
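
As a rough illustration of that chunking (assuming the standard 1500-byte
Ethernet MTU and roughly 40 bytes of IP+TCP headers per segment - textbook
values, not anything measured on your setup), the number of frames a
message occupies grows in steps rather than smoothly:

#include <stdio.h>

int main(void)
{
    const int mss = 1500 - 40;   /* payload per frame: MTU minus IP+TCP headers */
    int kb;

    for (kb = 1; kb <= 8; kb++) {
        int bytes  = kb * 1024;
        int frames = (bytes + mss - 1) / mss;   /* ceil(bytes / mss) */
        printf("%d KB -> %d frame(s)\n", kb, frames);
    }
    return 0;
}

Each additional frame brings its own per-packet overhead, so kilobyte-sized
messages don't necessarily scale as smoothly as the raw byte count would
suggest.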

> and remains at around 50us for messages between 10 and 100kb

From this, I assume that it is indeed Ethernet, but what kind? Are
you using Jumbo frames?

> but then increases in a staircase fashion beyond that.

If it's indeed TCP over Ethernet, most likely it's the network drivers
and/or TCP/IP stack that have a non-linear behaviour. For example,
some Linux network drivers have recently gained "NAPI" capabilities,
which means that when the interrupt rate generated by received packets
gets high, they switch to an interrupt-less mode in which they poll to
detect whether or not new packets have arrived; if the load then drops,
they go back to interrupt mode. This can kick in and out without any
control from user processes and can introduce serious measurement
artifacts. I'm not saying that this is what happened during your tests,
but it is an example of how things can go out of control.
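
If you want to check whether something like this is polluting your numbers,
one simple approach (just a sketch; the repetition count and the 100KB
message size are arbitrary choices, not anything from your setup) is to
repeat each send many times and compare the minimum with the mean - a large
gap between the two hints at intermittent delays of the kind described
above:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define REPS      100
#define MSG_BYTES (100 * 1024)

int main(int argc, char **argv)
{
    int rank, i;
    char *buf = malloc(MSG_BYTES);
    double t0, dt, tmin = 1e30, tsum = 0.0;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < REPS; i++) {
        if (rank == 0) {
            t0 = MPI_Wtime();
            MPI_Send(buf, MSG_BYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            dt = MPI_Wtime() - t0;
            if (dt < tmin) tmin = dt;
            tsum += dt;
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_BYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &st);
        }
    }

    if (rank == 0)
        printf("min %.3f ms, mean %.3f ms over %d sends\n",
               tmin * 1e3, (tsum / REPS) * 1e3, REPS);

    free(buf);
    MPI_Finalize();
    return 0;
}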

-- 
Bogdan Costescu