LAM/MPI General User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2007-02-24 14:43:51


On Feb 21, 2007, at 1:44 AM, Ramon Diaz-Uriarte wrote:

> On 2/20/07, Brian Barrett <brbarret_at_[hidden]> wrote:
>> This is a problem with the LAM daemon and internal resource
>> limitations. The LAM daemons were not intended to be used to run
>> that many processes on one node. Unfortunately, this will not be
>> fixed in LAM as it would require a large number of changes and we are
>> currently focusing all our development work on Open MPI. The best I
>> can suggest is to not spawn as many processes per node.
>
> Thanks for the reply. At least this will put a stop to my endless
> search for "what did I screw up".
>
> Two questions, though:
>
> 1. Am I less likely to run into these problems if I switch to Open
> MPI?
>
> 2. Would having many lamd per node (thus, with a lot fewer slaves per
> daemon) help?

You are less likely to run into this particular problem with Open MPI
than with LAM. However, if you are using MPI_COMM_SPAWN to start the
processes, I would warn you that Open MPI's support for spawning is
not nearly as stable as that found in LAM/MPI.

Running more than one lamd per node is only possible if the daemons
are in different "universes". This should allow you to run more
processes per node, but keep in mind that a process in one universe
cannot communicate with a process in another universe.
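
As a rough illustration of that trade-off, here is a small Python sketch.
It is not LAM code and calls nothing in LAM/MPI. The function names and
the per-daemon cap of 4 are made up for the example, since the actual
daemon limit is not stated in this thread. It only shows how splitting
workers across universes keeps each lamd under a cap, and which pairs of
workers could then still talk to each other.

```python
def assign_universes(n_workers, max_per_daemon):
    """Map worker ids 0..n_workers-1 to a universe index.

    Each universe stands for one lamd on the node and holds at most
    max_per_daemon workers. The cap is a hypothetical value chosen by
    the caller, not a documented LAM limit.
    """
    if max_per_daemon < 1:
        raise ValueError("max_per_daemon must be at least 1")
    return {w: w // max_per_daemon for w in range(n_workers)}


def can_communicate(assignment, a, b):
    """Two workers can exchange messages only inside the same universe."""
    return assignment[a] == assignment[b]


# Ten workers on one node, at most four per daemon (assumed cap).
assignment = assign_universes(n_workers=10, max_per_daemon=4)
n_universes = len(set(assignment.values()))

print(n_universes)                         # universes needed on this node
print(can_communicate(assignment, 0, 3))   # same universe
print(can_communicate(assignment, 3, 4))   # different universes
```

With these made-up numbers the node needs three universes, and a worker
in the first group cannot reach one in the second, so this only helps if
the work divides into groups that never need to exchange messages.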

Brian

-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have a LAM/MPI day: http://www.lam-mpi.org/