LAM/MPI General User's Mailing List Archives

From: Bogdan Costescu (Bogdan.Costescu_at_[hidden])
Date: 2005-11-07 09:10:21


On Thu, 3 Nov 2005, Carsten Kutzner wrote:

> The HP2848 switch has port counters where one can see how many
> packets have been dropped at each port.

You'd need to find out exactly what "dropped packets" means for the
switch and for the network driver/hardware that you use on the nodes.
In Linux, several distinct events (not enough memory, corrupted
Ethernet frame, different kinds of transmission errors) are sometimes
summed into a single counter and sometimes reported separately.
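
To make the point about separate counters concrete, here is a minimal
sketch that splits the per-interface columns of Linux's /proc/net/dev,
so that "errs", "drop", "fifo" and "frame" can be inspected one by one
rather than as a single "dropped" figure. The function name and the
sample numbers are made up for illustration; what each driver actually
counts under each column still has to be checked for the hardware in
use.

```python
# Column names as printed in the header of /proc/net/dev.
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed"]

# Hypothetical sample in the /proc/net/dev layout (numbers are invented).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    2    3    1     2          0         5  6000000    4500    0    4    0     7       0          0
"""


def parse_proc_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        numbers = [int(v) for v in values.split()]
        if len(numbers) < len(RX_FIELDS) + len(TX_FIELDS):
            continue  # unexpected layout, skip rather than guess
        rx = dict(zip(RX_FIELDS, numbers[:len(RX_FIELDS)]))
        tx = dict(zip(TX_FIELDS, numbers[len(RX_FIELDS):]))
        stats[name.strip()] = {"rx": rx, "tx": tx}
    return stats


def receive_problems(stats, interface):
    """Pick out the receive-side counters that may all be called 'drops'."""
    rx = stats[interface]["rx"]
    return {key: rx[key] for key in ("errs", "drop", "fifo", "frame")}
```

On a node, the same function can be fed the real file contents, e.g.
parse_proc_net_dev(open("/proc/net/dev").read()), and the result
compared before and after an MPI run.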

Do you use these network cards and the switch only for MPI traffic, or
do logins (rsh/ssh/etc.), a network file system, a queueing system or
any other kind of traffic go over them as well?

Please keep in mind that the network cards and the switch are not the
only things that touch the network traffic. The kernel also has to do
quite a lot of work (splitting buffers into Ethernet-frame-sized
pieces, routing, scheduling to compensate for packet loss, etc.), so
some bottleneck might be there too...
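
As a rough illustration of the buffer splitting mentioned above, the
sketch below estimates how many Ethernet frames one message turns
into. It assumes a 1500-byte MTU and 40 bytes of IPv4 + TCP headers
without options; both are assumptions, not values taken from this
setup, and the function name is made up for the example.

```python
def frames_needed(message_bytes, mtu=1500, header_bytes=40):
    """Estimate the number of frames needed to carry one message.

    mtu and header_bytes are assumed defaults (standard Ethernet MTU,
    IPv4 + TCP headers without options); adjust them for the real setup.
    """
    if message_bytes < 0:
        raise ValueError("message_bytes must be non-negative")
    payload_per_frame = mtu - header_bytes
    if payload_per_frame <= 0:
        raise ValueError("mtu must be larger than header_bytes")
    # Ceiling division: a partly filled last frame still counts as a frame.
    return -(-message_bytes // payload_per_frame)
```

With these assumptions a 1 MB (1,000,000-byte) message becomes 685
frames, each of which the kernel has to build, queue and possibly
retransmit.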

> Wouldn't an all-to-all be nicer that is under all circumstances free
> of congestion, even if it is slightly slower for small messages?

No. Lots of MPI applications are latency-sensitive and would therefore
behave worse with such an all-to-all routine.

-- 
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]