LAM/MPI General User's Mailing List Archives

From: Maharaja Pandian (pandian_at_[hidden])
Date: 2005-01-19 12:00:43


Your mail on Jan 4:
> Another note -- after running test all night, the errors only showed up
> in the usysv RPI. This is not 100% conclusive, of course -- the lack of
> an error showing up doesn't mean that the error doesn't exist -- but it
> does seem to give pretty good credence to my theory that this is a
> problem with the usysv RPI somehow.

I want to run Linpack using LAM/MPI 7.1.1 on a large cluster (more than
400 dual-processor nodes, connected with Gigabit Ethernet). I cannot take
a chance on the usysv RPI if it can potentially give wrong answers, so to
be safe I am thinking of choosing the tcp RPI module instead of the shared
memory RPI modules (sysv/usysv).
As you may know, Linpack uses MPI_BCAST. On each node (2 CPUs) only 2 MPI
tasks will run, so the system will not be oversubscribed.
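
For concreteness, I plan to force the tcp RPI explicitly at run time,
roughly as follows (assuming the standard SSI selection syntax; "hostfile"
and the xhpl path are just placeholders for my boot schema and the HPL
binary):

  lamboot hostfile
  mpirun -ssi rpi tcp C ./xhpl
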
When I ran a ping-pong test to measure latency and bandwidth for
node-to-node communication, there was no significant difference between
these modules. Do you have any pointers to performance results for LAM/MPI
on a large cluster?
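
The ping-pong test was essentially the standard pattern sketched below (a
rough sketch only; the 1 MB buffer and the repetition count are
placeholders, not the exact values from my runs). It needs at least 2
processes, e.g. "mpirun -np 2 pingpong":

/* pingpong.c: ranks 0 and 1 bounce a buffer back and forth, and
 * rank 0 reports the average one-way latency and bandwidth. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, i, reps = 1000;
    int nbytes = 1048576;                  /* 1 MB payload */
    char *buf;
    double t0, t1, rtt;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = (char *) malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        rtt = (t1 - t0) / reps;            /* average round-trip time */
        printf("one-way latency: %g us, bandwidth: %g MB/s\n",
               0.5 * rtt * 1e6, (2.0 * nbytes) / (rtt * 1e6));
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
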

Do you have any comments about choosing the tcp module for Linpack?

The LAM/MPI User's Guide (sec. 9.4.4, page 89) says: "Although all
communication is still layered on MPI point-to-point functions, the
algorithm attempts to maximize the use of on-node communication..."

If I choose the tcp RPI module, does the coll smp module then use only the
tcp RPI for its MPI point-to-point on-node communication?
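
In other words, with a command line along these lines (assuming
"-ssi coll smp" is the right way to request the smp coll module; the xhpl
path is again a placeholder), would even the two ranks on the same node
exchange their point-to-point messages over tcp?

  mpirun -ssi rpi tcp -ssi coll smp C ./xhpl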

Thanks for your help.