
LAM/MPI General User's Mailing List Archives


From: Ross Heikes (ross_at_[hidden])
Date: 2005-07-06 15:58:15


On Jul 6, 2005, at 7:40 AM, Brian Barrett wrote:

> On Jul 5, 2005, at 12:38 PM, Ross Heikes wrote:
>
>
>> We have an Apple Xserve with 40 nodes. We have made two internal
>> networks on this cluster.
>>
>> Suppose there are two jobs on node 5 in network "A" which communicate
>> with two jobs on say node 8 in network "B".
>>
>> My question is this:
>> is there a LAM/MPI command that will allow each job on node 5 to
>> communicate SIMULTANEOUSLY with the corresponding job on node 8?
>>
>
> I'm not sure I understand your question. Do you want the processes in
> one job to talk to the processes in the other job? Or for the
> processes to be able to communicate within the same job, but at the
> same time? If the first, you want to look at the MPI-2 connect/accept
> dynamic process management. If the second, everything should "just
> work", so you shouldn't have any problems.
>
> Of course, if I misunderstood your question completely, please include
> some more detail.
>
>
> Brian
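
The MPI-2 connect/accept path Brian mentions looks roughly like the
sketch below. It is a minimal illustration, not LAM-specific code: the
port-name exchange between the two jobs (here via stdout and the
command line) is an assumption -- any out-of-band mechanism, or
MPI_Publish_name/MPI_Lookup_name, would serve.

```c
/* Sketch: two separately launched MPI jobs connect at run time.
 * Run one instance with the argument "server"; it prints a port
 * name. Run the second instance with that port name as argument. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;

    MPI_Init(&argc, &argv);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* Job A: open a port and wait for the other job to connect */
        MPI_Open_port(MPI_INFO_NULL, port);
        printf("port name: %s\n", port);  /* hand this string to job B */
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Close_port(port);
    } else if (argc > 1) {
        /* Job B: connect using the port name printed by job A */
        MPI_Comm_connect(argv[1], MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    } else {
        fprintf(stderr, "usage: %s server | <port-name>\n", argv[0]);
        MPI_Finalize();
        return 1;
    }

    /* 'inter' is now an intercommunicator linking the two jobs;
     * ordinary MPI_Send/MPI_Recv on it moves data between them. */
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}
```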

Hello Brian,

Each node has 2 NICs; the master node has 3 (one connects to the
internet).
The problem is that no job should have to wait for another job just
because both are using the same network, when they could use
different networks.
For example, if node 5 has a job communicating over one NIC (network),
then -- AT THE SAME TIME -- the other job should be able to
communicate with node 8 over a different NIC.

The transcript below indicates that LAM/MPI's tcp rpi module does not
support this.

Should we consider using Open MPI, then?

>>> I have installed LAM 7.1 on dual-NIC computers in a cluster, and I'm
>>> wondering why I have not observed the increase in performance on
>>> version 0.6beta of the HPC challenge benchmarks (see
>>> http://icl.cs.utk.edu/hpcc/software/index.html) that I hoped to see
>>> from the extra available bandwidth. My test cluster has two
>>> switches, one for each network, and each node is connected to both
>>> networks. I followed the helpful advice of Tim Mattox (see
>>> http://www.lam-mpi.org/MailArchives/lam/msg05767.php) in using
>>> lamboot
>>> -l as well as setting up the hosts and nsswitch.conf files, but I
>>> didn’t get better benchmarks when using hosts files that listed
>>> two IP
>>> addresses for each node rather than just one IP address per node.
>>>
>>>
>>
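
For context, the setup Tim Mattox's advice leads to is a hosts file
listing one entry per interface per node. The hostnames and addresses
below are made up purely for illustration:

```
# /etc/hosts sketch -- two internal networks, one entry per NIC.
# All names and addresses here are hypothetical examples.
192.168.1.5   node5-net1
192.168.2.5   node5-net2
192.168.1.8   node8-net1
192.168.2.8   node8-net2
```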
>> As a clarification -- LAM will only use one NIC per process-pair for
>> the tcp rpi module. So it won't be doubling your bandwidth, even if
>> you have 2 networks between each node.
>>
>>
>>
>>> I do, however, notice an improvement over MPICH 1.2.5, which
>>> might be owing to the round-robin socket writing mentioned by Jeff
>>> Squyres (see http://www.lam-mpi.org/MailArchives/lam/msg04604.php).
>>>
>>>
>>
>> I'm unsure how MPICH implements writes across its sockets, so I can't
>> really say for sure why LAM is faster. :-)
>>
>>
>>
>>> however, Jeff also states that Open MPI will have “true simultaneous
>>> multi-device transport support” (see
>>> http://www.lam-mpi.org/MailArchives/lam/msg08737.php), which
>>> seems to
>>> imply that we still don’t have full support for multiple networks in
>>> LAM 7.1.
>>>
>>>
>>
>> Correct.
>>
>> By "true simultaneous multi-device transport support," I mean that
>> Open MPI will be able to utilize all network interconnects between
>> processes simultaneously. Hence, if you have 2 TCP NICs and 2
>> networks, Open MPI
>> will be able to use both of them. In a best-case scenario, this will
>> double your bandwidth, but the tradeoffs involved make this unlikely
>> (streaming data from RAM, contention on the memory and/or PCI busses,
>> etc.).
>>
>>
>>
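
For what it's worth, in released versions of Open MPI the interfaces
its TCP transport may use can be selected with MCA parameters; whether
this exact syntax applies to the early version under discussion is an
assumption, and the interface names are placeholders:

```shell
# Let Open MPI's tcp BTL use both NICs (eth0/eth1 are examples only);
# messages between process pairs can then be striped across them.
mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0,eth1 \
       -np 4 ./my_mpi_program
```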
>>> Do I have to rewrite the HPC benchmark code to utilize multiple
>>> NICs,
>>> or am I overlooking a necessary step that would allow me to obtain
>>> better bandwidth for off-the-shelf MPI software? And if LAM
>>> automatically uses multiple networks, can I turn this off to see
>>> results owing to just one NIC per node?
>>>
>>>
>>
>> MPI hides these abstractions from you -- you don't see NICs or
>> networks, you just MPI_SEND. So it's really up to the MPI
>> implementation itself to use multiple networks (or not). LAM does
>> not have this capability; sorry. :-\ Open MPI will. :-)
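
Jeff's point -- that the application never names a NIC or a network --
is visible in even the smallest MPI program. Nothing in the sketch
below refers to an interface; which wire (or wires) the message
travels over is entirely the implementation's decision:

```c
/* Minimal send/recv: run with at least 2 processes, e.g.
 * mpirun -np 2 ./a.out. No network interface appears anywhere. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```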