LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-10-06 10:09:11


On Oct 5, 2005, at 10:03 AM, Douglas Vechinski wrote:

> Yes, my test/development setup is two Linux boxes, one with two cpus
> and the other with a single (hyperthreading) cpu which "pretends" to
> have two cpus.

There have been a lot of reports about hyperthreading in HPC -- you
might want to google around and see if it's really going to help your
application or not.

> I'm not specifying any specific RPI so I assume it defaults to something
> appropriate.

It's probably defaulting to usysv (spin locks for multiple processes on
the same machine) -- so MPI_WAITANY will spin if its requests include
both local and remote processes.

> After the simulated error of one of the slaves, the master is not stuck
> in the 200 loop. It stays stuck within the MPI_WAITANY call and is
> sucking cpu cycles in there. I placed a write statement before and
> after the MPI_WAITANY statement to see if it is running through the 200
> loop continuously when it starts eating cpu cycles. It is not.

Ok.

You might want to check and see if WAITANY is eating up CPU for the
entire run -- the usysv RPI is designed to use spin locks so that it
will maximize performance (they're faster than SYSV semaphores, for
example). If so, then this is definitely normal / expected behavior.
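
If the spinning itself is the problem (for example, on a shared or
hyperthreaded box), one common workaround is to poll and sleep instead of
blocking inside MPI_WAITANY. The following is only a minimal sketch of that
idea and is not from the original message: wait_with_sleep, poll_fn,
countdown and countdown_poll are hypothetical names, and the countdown
callback merely stands in for a real test such as MPI_Testany so that the
sketch compiles without an MPI installation.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns nonzero once the awaited event
 * has completed. With MPI, the callback could wrap MPI_Testany, e.g.:
 *
 *   int flag, index;
 *   MPI_Status status;
 *   MPI_Testany(count, requests, &index, &flag, &status);
 *   return flag;
 */
typedef int (*poll_fn)(void *ctx);

/* Poll until completion, sleeping sleep_us microseconds between polls
 * so the waiting process yields the CPU instead of spinning.
 * Returns the number of polls that were made. */
static long wait_with_sleep(poll_fn poll, void *ctx, long sleep_us)
{
    struct timespec pause;
    long polls = 1;

    pause.tv_sec = sleep_us / 1000000L;
    pause.tv_nsec = (sleep_us % 1000000L) * 1000L;

    while (!poll(ctx)) {
        nanosleep(&pause, NULL);
        polls++;
    }
    return polls;
}

/* Stand-in for a real completion test: reports completion on the
 * call that brings the counter down to zero. */
struct countdown { int remaining; };

static int countdown_poll(void *ctx)
{
    struct countdown *c = ctx;
    return --c->remaining <= 0;
}
```

The trade-off is latency: a longer sleep interval frees more CPU but delays
the moment a completed request is noticed, which is the opposite of what
the spin locks in usysv are optimizing for.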

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/