LAM/MPI General User's Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-01-19 12:31:33


On Jan 19, 2005, at 12:00 PM, Maharaja Pandian wrote:

>> Another note -- after running test all night, the errors only showed
>> up in the usysv RPI. This is not 100% conclusive, of course -- the
>> lack of an error showing up doesn't mean that the error doesn't exist
>> -- but it does seem to give pretty good credence to my theory that
>> this is a problem with the usysv RPI somehow.
>
> I want to run Linpack using LAM/MPI 7.1.1 on a large cluster ( more
> than 400 nodes -- dual processor; Gigabit ethernet connection). I
> cannot take a chance in using the usysv rpi that can potentially give
> wrong answer. To be safe, I am thinking to choose tcp rpi module
> instead of the shared memory rpi modules sysv/usysv.

See Brian Barrett's later mail about this: the problem is *only* with
usysv, and *only* on PowerPC chips (e.g., G5, IBM POWER, etc.).
Specifically, it only happens on platforms that execute out-of-order
memory accesses.

The sysv RPI module is not susceptible -- it uses SYS V semaphores, not
spin locks, for locking.
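
If you want to force a particular RPI at run time (e.g., to stay off
usysv on a PowerPC cluster), you can name it on the mpirun command line
via LAM's SSI arguments -- see the User's Guide for your version's
exact syntax, but it's along these lines ("your_app" is just a
placeholder for your binary, e.g. HPL's xhpl):

   mpirun -ssi rpi sysv C ./your_app    (semaphore-based shared memory)
   mpirun -ssi rpi tcp  C ./your_app    (TCP even for on-node messages)

Running "laminfo" will list which rpi (and coll) modules your LAM
installation was built with.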

> As you may know, Linpack uses MPI_BCAST. On each node - 2 cpus, only
> 2 mpi tasks will run, so the system will not be oversubscribed. When I
> ran a pingpong test to measure the latency and bandwidth (node-to-node
> communication), there was no significant difference between these
> modules.

Ping pong will show distinct differences between two processes on the
same node using usysv vs. tcp (tcp will be slower), and will definitely
show large differences between two processes on the same node and two
processes on different nodes.
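
For reference, a latency "ping pong" is typically nothing more than
bouncing a tiny message back and forth and dividing the round-trip
time -- a minimal sketch in C (not the exact benchmark you ran, just an
illustration of what the number measures):

   #include <mpi.h>
   #include <stdio.h>

   int main(int argc, char **argv)
   {
       int rank, i;
       char buf[1] = { 0 };
       const int iters = 1000;
       double t0, t1;
       MPI_Status status;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);

       t0 = MPI_Wtime();
       for (i = 0; i < iters; ++i) {
           if (rank == 0) {            /* rank 0 sends, then waits */
               MPI_Send(buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
               MPI_Recv(buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
           } else if (rank == 1) {     /* rank 1 echoes it back */
               MPI_Recv(buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
               MPI_Send(buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
           }
       }
       t1 = MPI_Wtime();

       if (rank == 0)
           printf("avg one-way latency: %g usec\n",
                  (t1 - t0) / (2.0 * iters) * 1e6);

       MPI_Finalize();
       return 0;
   }

Run it with 2 processes on the same node vs. 2 processes on different
nodes (and with different rpi modules) and you will see the differences
I'm describing above.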

The performance of MPI_BCAST is an entirely different issue. MagPIe
and shmem-based collectives will perform well when you have 2 processes
per node.

> Do you have any pointer to performance results for LAM/MPI on a
> large cluster?

There are some pointers on our web site, but I'm not sure how current
they are.

> Do you have any comments about choosing the tcp module for Linpack?

You should use usysv unless you're on a G5 or other PowerPC-based
cluster.

> The LAM/MPI User's Guide, sec 9.4.4 - page 89, says: "Although all
> communication is still layered on MPI point-to-point functions, the
> algorithm attempts to maximize the use of on-node communication..."
>
> If I choose the tcp rpi module, does the coll smp module use only tcp
> rpi for MPI point to point on-node communication?

Correct.

The smp coll module uses MPI_Send and MPI_Recv (and variants) for its
communication. The shmem coll module uses direct shared memory -- the
smp coll module will use the shmem coll module "underneath" to execute
local on-node operations between multiple processes.
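
To be clear about what that means at the application level: your
MPI_BCAST call does not change at all -- which rpi and coll modules
actually move the bytes is decided at mpirun time. A trivial sketch
(the -ssi arguments in the comment are illustrative; check the User's
Guide for your version's exact syntax):

   #include <mpi.h>
   #include <stdio.h>

   /* e.g.: mpirun -ssi rpi tcp -ssi coll smp C ./bcast_test */
   int main(int argc, char **argv)
   {
       int rank, i;
       double data[1024];

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);

       if (rank == 0)
           for (i = 0; i < 1024; ++i)
               data[i] = (double) i;

       /* Root (rank 0) broadcasts to everyone; whether this travels
          over point-to-point sends or shared memory is up to the
          coll/rpi modules selected at run time, not this code. */
       MPI_Bcast(data, 1024, MPI_DOUBLE, 0, MPI_COMM_WORLD);

       printf("rank %d: data[42] = %g\n", rank, data[42]);

       MPI_Finalize();
       return 0;
   }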

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/