LAM/MPI General User's Mailing List Archives

From: Peter Skaarup (piparum_at_[hidden])
Date: 2003-06-15 16:26:13


Dear LAM implementors

I am writing my master's thesis in computer science. As part of
this thesis I have run test programs that measure the performance
of some LAM/MPI programs.

I got some strange results from these experiments. Reading the LAM
source code revealed why I saw the results I did, but raised the
question of why LAM is implemented as it is.

The graph at
www.daimi.au.dk/~piparum/beogobbler/series58/plot58.ps
shows the result of a program that measured the time (y axis) it
took LAM/MPI to send x bytes (x axis) from n0 to n1 and back,
divided by two.

Between 65536 and 65537 bytes there is a change in performance. A
quick search through the source code (version 6.5.9) revealed that
there is a line
#define LAM_TCPSHORTMSGLEN 65536
and that this define is used in rpi/tcp/rpi_tcp.c to select the
code path for a short message '<= 65536' or a long message
'> 65536'.
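
To make the threshold concrete, here is a minimal sketch of the kind of
selection described above. Only the LAM_TCPSHORTMSGLEN value is taken from
the 6.5.9 source; the enum and the function name choose_protocol are
hypothetical names used for illustration and are not copied from
rpi/tcp/rpi_tcp.c.

```c
#include <stddef.h>

/* Value quoted from the LAM 6.5.9 source. */
#define LAM_TCPSHORTMSGLEN 65536

/* Hypothetical names for illustration; not taken from rpi_tcp.c. */
enum msg_protocol { MSG_SHORT, MSG_LONG };

/* Pick the code path from the payload length alone:
 * lengths up to and including the threshold are "short",
 * anything larger is "long". */
static enum msg_protocol choose_protocol(size_t payload_len)
{
    return (payload_len <= LAM_TCPSHORTMSGLEN) ? MSG_SHORT : MSG_LONG;
}
```

With this sketch, a 65536-byte message takes the short path and a
65537-byte message takes the long path, which matches where the change
in the plot appears.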

My question is, why do you define this as the border between short
and long messages? It seems that TCP fragments the MPI message
anyway, into pieces that fit the MTU size, which is about 1500
bytes on most Ethernets. And with a message of 65536 bytes (a short
one by definition), 65560 bytes will be passed to the TCP layer
because of the 24-byte MPI header.
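
The arithmetic behind this can be sketched as follows. The 24-byte header
comes from the paragraph above; the 1460-byte segment payload is my own
assumption (a 1500-byte MTU minus 20 bytes of IP header and 20 bytes of
TCP header, with no options), and the function names are hypothetical.

```c
#include <stddef.h>

#define MPI_HEADER_LEN 24   /* header size stated above */
#define ASSUMED_MSS    1460 /* assumption: 1500 MTU - 20 IP - 20 TCP */

/* Bytes handed to the TCP layer for one MPI message. */
static size_t bytes_to_tcp(size_t payload_len)
{
    return payload_len + MPI_HEADER_LEN;
}

/* Number of TCP segments needed for nbytes, rounding up. */
static size_t segments_needed(size_t nbytes, size_t mss)
{
    return (nbytes + mss - 1) / mss;
}
```

Under these assumptions, bytes_to_tcp(65536) gives 65560, and
segments_needed(65560, ASSUMED_MSS) gives 45, so even the largest "short"
message is already split into dozens of segments on the wire.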

Can you explain the reason for the design choice concerning the
short message size?

Yours
Peter Skaarup
Student at Department of Computer Science, Aarhus University, Denmark.
e-mail: piparum_at_[hidden]