Hi,
I assigned multiple processes to one machine instead of spreading them across several machines, expecting the communication cost to be lower than in the multi-machine case. However, the result is strange: both computation cost and communication cost increased sharply, especially for MPI_Reduce and MPI_Sendrecv. It seems that the underlying implementation of lam_mpi doesn't favor multiple processes on the same host.
Your help will be greatly appreciated.
Thanks