Thanks, Timothy, for your very detailed answer; that's exactly what
I was looking for. I'll fetch all the pieces and let you know
how it goes.
Cheers,
Roberto
On Tue, 22 Apr 2003, Timothy I Mattox wrote:
> Hi Roberto,
> What your triangle network topology looks like is a switchless
> FNN (Flat Neighborhood Network), at least in our research's terminology.
> (See http://aggregate.org/FNN/.) There is a patch to LAM to make it work
> with FNNs; it is in the 6.6b1 beta, as well as any recent CVS version.
> In those versions of LAM, there is this option to lamboot that will help
> make things work for your multiple NIC per node setup:
> -l Use local hostname resolution (vs. centralized name lookup)
> You will need to assign unique IP addresses for each NIC, and set up the
> /etc/hosts and /etc/nsswitch.conf files on each node as follows:
> NIC info for node A:
> eth0 NETWORK=10.0.1.0 IPADDR=10.0.1.1 (connected to node B's eth0)
> eth1 NETWORK=10.0.2.0 IPADDR=10.0.2.1 (connected to node C's eth0)
> NIC info for node B:
> eth0 NETWORK=10.0.1.0 IPADDR=10.0.1.2 (connected to node A's eth0)
> eth1 NETWORK=10.0.3.0 IPADDR=10.0.3.2 (connected to node C's eth1)
> NIC info for node C:
> eth0 NETWORK=10.0.2.0 IPADDR=10.0.2.3 (connected to node A's eth1)
> eth1 NETWORK=10.0.3.0 IPADDR=10.0.3.3 (connected to node B's eth1)
> With each NETMASK as 255.255.255.0, thus yielding 3 subnets, one
> for each edge of your triangle.
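
The addressing plan above can be sanity-checked before any node is configured. What follows is a minimal Python sketch, not part of LAM: the `NICS` table and the helper names `subnets_of` and `shared_subnets` are hypothetical, and the addresses are simply copied from the listing above. It checks that every pair of nodes shares exactly one /24 subnet, i.e. one subnet per edge of the triangle.

```python
import ipaddress
from itertools import combinations

NETMASK = "255.255.255.0"

# (node, interface) -> IP address, copied from the plan above.
NICS = {
    ("A", "eth0"): "10.0.1.1",
    ("A", "eth1"): "10.0.2.1",
    ("B", "eth0"): "10.0.1.2",
    ("B", "eth1"): "10.0.3.2",
    ("C", "eth0"): "10.0.2.3",
    ("C", "eth1"): "10.0.3.3",
}


def subnets_of(node):
    """Return the set of subnets on which `node` has a NIC."""
    return {
        ipaddress.ip_interface(f"{ip}/{NETMASK}").network
        for (name, _iface), ip in NICS.items()
        if name == node
    }


def shared_subnets(a, b):
    """Return the subnets that nodes `a` and `b` are both attached to."""
    return subnets_of(a) & subnets_of(b)


if __name__ == "__main__":
    for a, b in combinations("ABC", 2):
        common = shared_subnets(a, b)
        status = "ok" if len(common) == 1 else "PROBLEM"
        print(a, b, sorted(str(net) for net in common), status)
```

If a pair shares no subnet, or more than one, the cabling or the addresses probably do not match the triangle.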
>
> For /etc/hosts you need to give each node a personalized list of IP
> addresses for its neighbors in the cluster like this:
> /etc/hosts for node A:
> 10.0.1.1 nodeA (this is mostly a placeholder so nodeA knows itself)
> 10.0.1.2 nodeB
> 10.0.2.3 nodeC
> /etc/hosts for node B:
> 10.0.1.1 nodeA
> 10.0.1.2 nodeB (this is mostly a placeholder so nodeB knows itself)
> 10.0.3.3 nodeC
> /etc/hosts for node C:
> 10.0.2.1 nodeA
> 10.0.3.2 nodeB
> 10.0.2.3 nodeC (this is mostly a placeholder so nodeC knows itself)
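
Since writing the per-node /etc/hosts files by hand is error-prone, here is a small Python sketch of how they could be derived from the address plan. It is only an illustration under stated assumptions: the `NODE_IPS` table and the functions `hosts_entries` and `render_hosts` are hypothetical names, and each node's own placeholder entry is assumed to be its first (eth0) address, as in the listings above.

```python
import ipaddress

# Per-node NIC addresses in (eth0, eth1) order, taken from this message.
NODE_IPS = {
    "nodeA": ["10.0.1.1", "10.0.2.1"],
    "nodeB": ["10.0.1.2", "10.0.3.2"],
    "nodeC": ["10.0.2.3", "10.0.3.3"],
}
PREFIX = 24  # corresponds to NETMASK 255.255.255.0


def network_of(ip):
    """Return the /24 network that `ip` belongs to."""
    return ipaddress.ip_interface(f"{ip}/{PREFIX}").network


def hosts_entries(node):
    """Return [(ip, name), ...] as `node` should see its neighbors."""
    local_nets = {network_of(ip) for ip in NODE_IPS[node]}
    entries = []
    for peer, ips in NODE_IPS.items():
        if peer == node:
            # Placeholder so the node can resolve its own name.
            entries.append((ips[0], peer))
            continue
        # Use the peer address on a subnet this node is directly attached to.
        reachable = [ip for ip in ips if network_of(ip) in local_nets]
        if len(reachable) != 1:
            raise ValueError(f"{node} has no unique direct link to {peer}")
        entries.append((reachable[0], peer))
    return entries


def render_hosts(node):
    """Format the entries as /etc/hosts lines."""
    return "\n".join(f"{ip} {name}" for ip, name in hosts_entries(node))


if __name__ == "__main__":
    for node in NODE_IPS:
        print(f"# /etc/hosts for {node}")
        print(render_hosts(node))
```

The sketch only prints the entries; copying them into each node's /etc/hosts is left as a manual step.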
>
> For /etc/nsswitch.conf, you need to make sure that the "hosts" line
> has "files" as the first choice for name resolution. For example:
> hosts: files dns
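
To confirm the "hosts" line on each node, something like the following Python sketch could be used. The function names `hosts_sources` and `files_first` are hypothetical, and the parsing is deliberately simple: it strips `#` comments and looks only at the first source on the `hosts:` line.

```python
def hosts_sources(nsswitch_text):
    """Return the sources listed on the 'hosts:' line, or [] if absent."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []


def files_first(nsswitch_text):
    """True if 'files' is the first name-resolution source for hosts."""
    sources = hosts_sources(nsswitch_text)
    return bool(sources) and sources[0] == "files"


if __name__ == "__main__":
    good = "passwd: files\nhosts: files dns\n"
    bad = "passwd: files\nhosts: dns files\n"
    print(files_first(good), files_first(bad))
```

On a real node, the text would come from reading /etc/nsswitch.conf rather than from the sample strings used here.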
>
> Now, when you do a lamboot, use the -l option and give it a lamhosts
> file containing:
> nodeA
> nodeB
> nodeC
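
As a small illustration of this step, the Python sketch below writes such a lamhosts file and assembles the lamboot command line without running it. The helper names `write_lamhosts` and `lamboot_command` are hypothetical, and passing the file name after `-l` is an assumption about how lamboot is invoked here.

```python
import shlex
import tempfile
from pathlib import Path

NODES = ["nodeA", "nodeB", "nodeC"]


def write_lamhosts(path, nodes=NODES):
    """Write one hostname per line, as in the listing above."""
    path = Path(path)
    path.write_text("".join(f"{name}\n" for name in nodes))
    return path


def lamboot_command(lamhosts_path):
    """Build (but do not run) the lamboot command with the -l option."""
    return ["lamboot", "-l", str(lamhosts_path)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_lamhosts(Path(tmp) / "lamhosts")
        print(path.read_text(), end="")
        print(shlex.join(lamboot_command(path)))
```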
>
> Then use mpirun as normal. Each node will talk to its neighbors through
> the appropriate NIC. Yeah, it's a lot of work to do by hand to set
> this up. I'm still working on a simpler approach for FNN use and setup.
>
> I hope that was detailed enough to get things working for you. :-)
>
> Oh, one more thing, you may want to turn on the ARP filter on each node
> with these additional lines in the /etc/sysctl.conf file on each node:
> #turn on ARP filters
> net.ipv4.conf.all.arp_filter = 1
> net.ipv4.conf.default.arp_filter = 1
>
> Linux has a nasty default setting that can mess up the proper behaviour
> of the ARP protocol in more "interesting" networks such as this.
> In your specific case this ARP filter isn't necessary, but for general
> FNNs this is a must to get any acceptable performance.
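
A quick way to confirm that both ARP filter settings made it into a node's sysctl.conf is sketched below in Python. The names `parse_sysctl` and `missing_arp_filter_keys` are hypothetical, and the parser assumes the plain `key = value` layout shown above, with `#` or `;` starting a comment line.

```python
REQUIRED_KEYS = (
    "net.ipv4.conf.all.arp_filter",
    "net.ipv4.conf.default.arp_filter",
)


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_arp_filter_keys(text):
    """Return the ARP filter keys that are absent or not set to 1."""
    settings = parse_sysctl(text)
    return [key for key in REQUIRED_KEYS if settings.get(key) != "1"]


if __name__ == "__main__":
    sample = (
        "#turn on ARP filters\n"
        "net.ipv4.conf.all.arp_filter = 1\n"
        "net.ipv4.conf.default.arp_filter = 1\n"
    )
    print(missing_arp_filter_keys(sample))
```

This only inspects the file contents; it does not tell you whether the running kernel has picked the settings up.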
>
> Enjoy!
> --
> Tim Mattox - tmattox_at_[hidden] - http://home.earthlink.net/~timattox
> http://aggregate.org/KAOS/ - http://advogato.org/person/tmattox/
>
> On Mon, 21 Apr 2003, R.C.Pasianot wrote:
> > Hello there,
> >
> > I seem to recollect this has been asked before. Would someone please
> > either point me to the right place (sorry, searching using our internet
> > connection is a real pain) or give me a quick answer?
> >
> > Here's the scenario. I have 3 hosts, A, B, and C, each one furnished
> > with 2 NICs. So I connect them as if on the vertices of a triangle
> > (I don't have a spare switch):
> >
> >         A
> >        / \
> >       /   \
> >      /     \
> >     B-------C
> >
> > Now I "lamboot", say, from A and want all the hosts to communicate among
> > themselves using the shortest paths, namely, the sides AB, BC and CA.
> >
> > Is it possible to do this? What would a lamhosts file look like?
> >
> > Thanks a lot. Regards,
> >
> > Roberto
> >
> >
> > _______________________________________________
> > This list is archived at http://www.lam-mpi.org/MailArchives/lam/