Dear all,
This may be a little off topic for this list, but as our problems are
mainly related to running LAM applications, I would like to ask for your
help.
We have a Beowulf-type cluster built from P4 PCs with ASUS P4C800 Deluxe
mainboards, which include a 3C940 Gigabit LOM as network hardware. The
installed operating system is Fedora Core 1 (Linux kernel
2.4.2-1.2115.nptlsmp). For the network interface I had to install the
driver from Syskonnect because the one supplied by ASUS did not work
properly: it almost worked, but lamboot failed when I tried it with a
few nodes. With the Syskonnect driver the network seems to work, and
LAM applications start, but:
- low performance: measured with netperf using LAM, we reach only about
350 Mbps, and only for large packet sizes and only when "-lamd" is
specified in mpirun.
- instabilities: for some LAM applications we had to specify "-nger" in
mpirun, otherwise some copies of the programs crashed.
I'm currently using LAM 6.5.9. I also tried the latest LAM version, but
with similar results. If we had to switch all our applications to LAM
7.0.x, we would have to recompile everything.
Does anyone have a similar system? Does anyone have a solution to
properly handle the 3C940 Gigabit LOM card? A better or modified driver?
Anything?
I'm open to any ideas.
Thank you all in advance for your patience if you've read to this line.
--
Dr. Antonio M. Marquez
Dpt. of Physical Chemistry Departamento de Química Física
University of Seville Universidad de Sevilla
E-41012 Seville (SPAIN) 41012 Sevilla
Phone: 34-95-4557177 (Ext. 213) Telf. 95-4557177 (Ext. 213)
Fax: 34-95-4557174 Fax. 95-4557174