On Fri, 28 May 2004, Tim Mattox wrote:
> Switches that support link aggregation or trunking cost more than
> switches that don't, at least when I last looked.
Most switches also limit the number of trunks that can be defined;
on a 24-port switch there are usually only 4 or 6 trunks allowed.
> The two situations that I am aware of that would indicate use
> of trunking instead of separate networks are:
> 1) a fault tolerance setup, with the bonding mode set in one of the
> various fault tolerant modes (not the default round-robin mode).
... only if you consider this single switch a reliable component.
Otherwise, if you lose the switch (for example, to a power supply
failure) you lose both slave connections and the node becomes
unreachable. It is much less likely for two switches to die at the
same time.
> 2) The interrupt load from a single GigE can be high enough to
> saturate the CPU, thus the introduction of NAPI in more recent
> Linux kernels, as well as Jumbo packets. Not all network drivers
> support NAPI yet, and Jumbo frames are not universally supported.
There are more problems with these as well. NAPI is not a magic
solution for interrupt load problems: it does reduce the interrupt
load, but it introduces latency exactly where it counts most (IMHO):
on the receive path. NAPI is usually effective for sustained
transfers, so it might indeed bring an advantage if the parallel
application is of the coarse-grained type, transferring large amounts
of data but not often.
Jumbo frames can slow down transfers as well, because the network
driver needs to allocate the packet buffers in advance. On x86,
memory pages are 4 KiB, so a normal 1500-byte packet fits into a
single page, while a 9 KiB jumbo frame needs 3 contiguous pages,
which are harder to allocate.
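(For completeness: raising the MTU itself is the easy part. A minimal
sketch, assuming the interface is called "eth0" - the name and the
9000-byte value are just examples, and the ioctl will fail if the
driver does not support jumbo frames:)

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(void)
{
    struct ifreq ifr;
    /* any socket will do as a handle for the interface ioctl */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1); /* example interface */
    ifr.ifr_mtu = 9000;                          /* jumbo frame MTU */
    if (ioctl(fd, SIOCSIFMTU, &ifr) < 0) {
        perror("SIOCSIFMTU"); /* driver lacks jumbo support, or not root */
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}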
> Problem #3 comes about because with packets arriving so quickly, the
> driver will pull several packets from a single NIC before getting
> packets from the next NIC.
It's all about tradeoffs. Usually 2 or more NICs are paired with 2 or
more CPUs. To improve interrupt processing speed, the NIC interrupts
can be bound to specific CPUs (interrupt affinity, set by writing a
CPU bitmask to /proc/irq/<n>/smp_affinity), so that several CPUs can
receive packets from different NICs simultaneously. However, if the
round-robin bonding policy is used, the packets will still need to be
reordered. A better solution in this case might be the balance-xor
bonding policy, with the MAC addresses chosen such that the packets
are more or less equally distributed between the NICs; this might not
be so easy to set up, though, and it requires a good understanding of
the communication pattern of the program - if you run several
programs with different communication patterns, things can get
complicated very quickly...
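To make the balance-xor point more concrete, here is a rough sketch
(not the kernel source, just the idea as I understand it) of how the
layer-2 XOR policy picks the outgoing slave: the last byte of the
source and destination MAC addresses are XORed and taken modulo the
number of slaves, so the addresses you assign directly decide which
NIC each peer's traffic uses. The MAC values below are hypothetical.

#include <stdio.h>

/* Illustration of balance-xor (layer-2) slave selection:
 * slave = (last byte of src MAC ^ last byte of dst MAC) % nr_slaves
 * Not the actual kernel code, just the idea. */
static int xor_slave(const unsigned char src_mac[6],
                     const unsigned char dst_mac[6],
                     int nr_slaves)
{
    return (src_mac[5] ^ dst_mac[5]) % nr_slaves;
}

int main(void)
{
    /* hypothetical addresses: our bond ends in 0x10, peers in 0x20..0x23 */
    unsigned char self[6] = { 0x00, 0x50, 0x56, 0x00, 0x00, 0x10 };
    unsigned char peer[6] = { 0x00, 0x50, 0x56, 0x00, 0x00, 0x20 };
    int i;

    for (i = 0; i < 4; i++) {
        peer[5] = 0x20 + i;
        printf("peer ...:%02x -> slave %d\n",
               peer[5], xor_slave(self, peer, 2));
    }
    return 0;
}

With consecutive last bytes the peers alternate between the two
slaves; if all the peers happened to have, say, even last bytes, all
their traffic would land on the same NIC - which is exactly the kind
of imbalance I meant above.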
--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]