On Thu, 13 Dec 2001, Martin Weinberg wrote:
> I have a relatively new MCMC code that shows good scaling only when I
> use -lamd. I see nearly 100% load per cpu using lamd and only 10% per
> cpu using c2c. MCMC is embarrassingly parallel so I would expect nearly
> linear scaling for quite a large N.
In general, c2c gives better performance than lamd. c2c communication
allows two processes to talk directly to each other, using either TCP or
shmem. The lamd transport requires that the message be sent to the local
lamd over a Unix domain socket. The local lamd then talks to the remote
lamd over a UDP connection, and the remote lamd talks to the remote
process over another Unix domain socket.
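To make the two paths concrete, here is a small Python sketch that only writes the hops down as data and counts them. It does not call LAM/MPI at all; the names C2C_PATH, LAMD_PATH and extra_hops are made up for illustration.

```python
# Toy description of the two message paths; nothing here talks to LAM/MPI.
# All names (C2C_PATH, LAMD_PATH, extra_hops) are illustrative only.

C2C_PATH = [
    "sending process -> receiving process (TCP or shared memory)",
]

LAMD_PATH = [
    "sending process -> local lamd (Unix domain socket)",
    "local lamd -> remote lamd (UDP)",
    "remote lamd -> receiving process (Unix domain socket)",
]


def extra_hops(path, baseline=C2C_PATH):
    """Return how many more hops `path` has than the direct c2c path."""
    return len(path) - len(baseline)


if __name__ == "__main__":
    for name, path in (("c2c", C2C_PATH), ("lamd", LAMD_PATH)):
        print(f"{name}: {len(path)} hop(s)")
        for hop in path:
            print("   ", hop)
    print("extra hops with lamd:", extra_hops(LAMD_PATH))
```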
c2c generally gives better performance because removing the two extra
hops makes throughput much higher. lamd mode, however, often gives much
better performance when non-blocking sends are being used. Handing a
message from the MPI process to the local lamd is basically
instantaneous, and the communication between nodes can then occur while
the MPI process is computing.
My guess is that you are seeing the performance increase because the
application depends on a heavy overlap of computation and communication.
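The overlap effect can be sketched with a toy Python model. This is not LAM/MPI code: a helper thread stands in for the local lamd, time.sleep stands in for both computation and message transfer, and the delays are arbitrary. It also assumes, for the direct case, that a transfer only advances while the process itself is driving it, which is a simplification.

```python
import queue
import threading
import time


def transfer(delay):
    """Stand-in for moving one message between nodes (made-up cost)."""
    time.sleep(delay)


def compute(delay):
    """Stand-in for one step of computation (made-up cost)."""
    time.sleep(delay)


def run_direct(steps, send_delay, compute_delay):
    """The process drives every transfer itself, then computes."""
    start = time.perf_counter()
    for _ in range(steps):
        transfer(send_delay)
        compute(compute_delay)
    return time.perf_counter() - start


def run_with_daemon(steps, send_delay, compute_delay):
    """The process hands each message to a helper and keeps computing."""
    outbox = queue.Queue()

    def daemon():
        # Plays the role of the local lamd: drains messages in the background.
        while True:
            msg = outbox.get()
            if msg is None:  # sentinel: no more messages
                break
            transfer(send_delay)

    helper = threading.Thread(target=daemon)
    start = time.perf_counter()
    helper.start()
    for i in range(steps):
        outbox.put(i)  # the hand-off returns immediately
        compute(compute_delay)
    outbox.put(None)
    helper.join()  # wait until every message has been delivered
    return time.perf_counter() - start


if __name__ == "__main__":
    direct = run_direct(5, 0.02, 0.02)
    overlapped = run_with_daemon(5, 0.02, 0.02)
    print(f"direct: {direct:.3f}s  with helper: {overlapped:.3f}s")
```

In this model the helper version finishes sooner because transfers proceed while the main loop is busy. The actual gap in a real application depends on message sizes and on how much computation there is to hide the communication behind.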
Brian
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/