On Oct 28, 2005, at 7:21 AM, Sebastian Forsman wrote:
> Is there a limitation of maximum nodes in a LAM/MPI cluster (besides
> network bandwidth)?
> How about maximum nodes that can run a single job?
If you're using TCP for communication, you'll start running into
problems at both the per-process file descriptor limit (sometimes 1024,
sometimes 2048 -- check your local settings) and the size of the C
type fd_set (its capacity is FD_SETSIZE). Some OSes (like OS X) let you
change the size of fd_set when LAM is compiled.
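To check your local settings, something like the following rough sketch
should work on most POSIX systems. It reads the file descriptor limit with
getrlimit(RLIMIT_NOFILE) and reports the compile-time constant FD_SETSIZE,
which is the number of descriptors an fd_set can hold. The function names
are made up for illustration; none of this is LAM code.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper: number of descriptors an fd_set can hold
 * on this system, as fixed when this file was compiled. */
long fd_setsize(void)
{
    return (long) FD_SETSIZE;
}

/* Hypothetical helper: print the soft and hard per-process file
 * descriptor limits plus FD_SETSIZE. Returns 0 on success, -1 if
 * getrlimit() fails. */
int print_fd_limits(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }

    if (rl.rlim_cur == RLIM_INFINITY)
        printf("soft fd limit: unlimited\n");
    else
        printf("soft fd limit: %llu\n", (unsigned long long) rl.rlim_cur);

    if (rl.rlim_max == RLIM_INFINITY)
        printf("hard fd limit: unlimited\n");
    else
        printf("hard fd limit: %llu\n", (unsigned long long) rl.rlim_max);

    printf("FD_SETSIZE:    %ld\n", fd_setsize());
    return 0;
}
```

Call print_fd_limits() from a small main() on each node; the smaller of
the soft limit and FD_SETSIZE is the number to watch.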
These restrictions are mainly because LAM opens sockets between all
peers in an MPI job during MPI_INIT (and the MPI-2 dynamic functions).
FWIW, Open MPI doesn't do this -- we only open sockets when they are
used.
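As back-of-the-envelope arithmetic for the all-to-all case: if every
process opens one socket to every other peer, a job with N processes needs
N-1 sockets per process and N*(N-1)/2 connections overall. The sketch below
turns that into a rough ceiling on job size. The function names and the
reserved_fds parameter (descriptors already in use for stdin/stdout/stderr
and so on) are assumptions for illustration; the real per-process overhead
in LAM is not stated here and may be higher.

```c
/* Hypothetical helper: total number of TCP connections in a job
 * where every pair of processes is connected. */
long total_connections(long nprocs)
{
    if (nprocs < 2)
        return 0;
    return nprocs * (nprocs - 1) / 2;
}

/* Hypothetical helper: largest job size N such that
 * (N - 1) peer sockets + reserved_fds <= fd_limit.
 * fd_limit should be the smaller of the per-process descriptor
 * limit and FD_SETSIZE. Returns 0 if even the reserved
 * descriptors do not fit. */
long max_fully_connected_procs(long fd_limit, long reserved_fds)
{
    long usable = fd_limit - reserved_fds;

    if (usable < 0)
        return 0;
    return usable + 1;   /* N - 1 <= usable  =>  N <= usable + 1 */
}
```

For example, with a limit of 1024 and 3 descriptors assumed reserved,
max_fully_connected_procs(1024, 3) gives 1022 processes, which would mean
total_connections(1022) = 521731 sockets pairs across the whole job.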
--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/