LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-12-13 20:30:57


On Dec 12, 2005, at 9:28 AM, Douglas Vechinski wrote:

> Yes, the problem is present even when the master is on a node all
> to itself.
>
> No scheduler is present or being used.
>
> Yes, I can verify that the master is on an essentially idle node
> (i.e. no active time-consuming processes are running).

The next thing to check is your network: make sure it is actually
delivering messages promptly, and not stalling on timeouts and then
retrying the send. This is definitely not my forte, so others on this
list may be able to chime in better than I can.

For example, some network switches are known to handle congestion
poorly, which causes delays.
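
One rough way to look for that kind of stall, independent of MPI, is to
time many small round trips over TCP between two nodes and see whether a
few of them take far longer than the rest. The sketch below is only an
illustration, not a LAM/MPI tool: the function names (start_echo_server,
measure_rtts, find_stalls), the 64-byte payload, and the "10x the median"
threshold are all assumptions made up for this example.

```python
import socket
import statistics
import threading
import time


def start_echo_server(host="127.0.0.1", port=0):
    """Start a one-connection TCP echo server in a background thread.

    Returns the (host, port) it is listening on. Hypothetical helper.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:
                    break
                conn.sendall(data)
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()


def measure_rtts(host, port, count=200, payload=b"x" * 64):
    """Send `count` small messages; return each round-trip time in seconds."""
    rtts = []
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle's algorithm so small messages go out immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendall(payload)
            received = 0
            while received < len(payload):
                chunk = sock.recv(len(payload) - received)
                if not chunk:
                    raise ConnectionError("echo server closed the connection")
                received += len(chunk)
            rtts.append(time.perf_counter() - start)
    return rtts


def find_stalls(rtts, factor=10.0):
    """Return (index, rtt) pairs whose RTT exceeds `factor` times the median.

    The threshold is an arbitrary assumption; tune it for your network.
    """
    median = statistics.median(rtts)
    return [(i, t) for i, t in enumerate(rtts) if t > factor * median]


if __name__ == "__main__":
    # Loopback demo; on a real cluster, run the echo server on another node.
    host, port = start_echo_server()
    rtts = measure_rtts(host, port)
    print(f"median RTT: {statistics.median(rtts) * 1e6:.1f} us, "
          f"max: {max(rtts) * 1e6:.1f} us")
    for i, t in find_stalls(rtts):
        print(f"message {i}: {t * 1e3:.2f} ms (possible retry or congestion stall)")
```

If most round trips take microseconds but a handful take hundreds of
milliseconds, that pattern would be consistent with timeouts and retries
somewhere in the network path, though it does not by itself say where.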

--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/