On Tue, 24 Feb 2004, Damien Hocking wrote:
> Two MPI processes on a two-processor machine will not run as fast as
> a true multithreaded application on a two-processor machine. A 61%
> speedup is pretty good.
This most likely indicates memory contention: the processes are doing
memory-intensive (and probably cache-unfriendly) computations, and
both have to wait for main memory to deliver data. While the memory
"speed" might have been enough for one process, it becomes the
bottleneck for two processes.
If this is the case, better scalability with a small number of nodes
can usually be obtained by running only one process per node. With a
large number of nodes, the network "speed" becomes the bottleneck.
[ By "speed" in the above sentences, I mean to cover all components
without going into details. The beowulf list contains several
discussions about the details. ]
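
[ To put a rough number on the contention argument, here is a toy
model in Python. It is only a sketch under strong assumptions that are
mine, not measurements: a single process spends a fraction f of its
run time waiting on main memory, that memory time does not shrink at
all when the work is split over two processes on the same node (the
bus is already saturated), and the rest of the work parallelizes
perfectly. The function names are made up for illustration. ]

```python
def two_process_speedup(mem_fraction):
    """Toy model: speedup of two processes sharing one node's memory bus.

    mem_fraction is the (assumed) fraction of single-process run time
    spent waiting on main memory. Compute time halves with two
    processes; memory time is assumed not to shrink at all.
    """
    if not 0.0 <= mem_fraction <= 1.0:
        raise ValueError("mem_fraction must be between 0 and 1")
    t_one = 1.0                                   # normalized single-process time
    t_two = (1.0 - mem_fraction) / 2.0 + mem_fraction
    return t_one / t_two                          # equals 2 / (1 + mem_fraction)


def mem_fraction_for_speedup(speedup):
    """Invert the toy model: which memory fraction yields this speedup?"""
    if not 1.0 <= speedup <= 2.0:
        raise ValueError("the toy model only covers speedups from 1 to 2")
    return 2.0 / speedup - 1.0


if __name__ == "__main__":
    for f in (0.0, 0.1, 0.25, 0.5, 1.0):
        print(f"memory fraction {f:.2f} -> speedup {two_process_speedup(f):.2f}")
```

[ If the 61% figure quoted above is read as a 1.61x speedup, this
model would put the memory-bound fraction at roughly a quarter of the
single-process run time. Real machines overlap computation and memory
access, so take that as an illustration of the trend only. ]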
You can't call this distributed shared memory because it isn't :-)
Shared memory is one allocated memory area that is accessed by
several processes. This might be how the underlying mechanism for MPI
_intranode_ process communication works, but it is certainly not how
MPI _internode_ process communication works. As soon as data has to
be copied, it is no longer shared memory.
--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]