On Thu, 11 Mar 2004, Wang Shuguang wrote:
> I wrote a small Fortran MPI program that reads and writes a file on an
> NFS disk to exchange information between two processes. I have two
> identical dual-processor machines and ran two tests. First, I ran the
> program on a single dual-processor machine with two processes. Then I
> ran the same program on two machines, with one process on each. The
> program takes longer to run two processes on a single machine than on
> two machines. Is that normal? If not, what could be the reason? I
> suspect it has something to do with IO.
It is quite possible -- this is likely to have little to do with MPI. It
probably depends on your NFS implementation (which is notorious for not
having good scaling properties, but YMMV; I'm not an NFS expert).
Consider also the bandwidth requirements -- if both processes read/write
on one node, you're effectively doubling the bandwidth required for that
node. So the bottleneck is the single node (in many places: the network,
the OS, the NFS client, etc.). If you split that across two nodes, you're
at least splitting the traffic across two ports on your switch (I'm
assuming you have a switch and not a hub), and your [presumably]
server-class NFS server may be able to handle the simultaneous requests
better. Hence, you at least take one bottleneck out of the equation, and
it seems that the other bottlenecks (switch, bandwidth, server, etc.)
allow better performance.
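To put rough numbers on the bandwidth argument, here is a back-of-the-envelope
sketch in Python. Everything in it is an assumption for illustration: the
function name transfer_time, the 100 MB per process, the 12.5 MB/s client
link, and the 50 MB/s server are made-up figures, not measurements of your
setup. It models only two bottlenecks (the client's network link and the NFS
server) and takes whichever is slower.

```python
def transfer_time(bytes_per_process, procs_per_node, nodes, link_bw, server_bw):
    """Crude bottleneck model: the slower of client link and server wins.

    All bandwidths are in bytes per second. This ignores the OS, the NFS
    client, caching, and latency; it only illustrates the reasoning.
    """
    per_node_bytes = bytes_per_process * procs_per_node
    total_bytes = per_node_bytes * nodes
    client_time = per_node_bytes / link_bw   # each node pushes its share over its own port
    server_time = total_bytes / server_bw    # the server sees all traffic combined
    return max(client_time, server_time)


MB = 1_000_000  # hypothetical unit: decimal megabytes

# Two processes sharing one node (one switch port).
one_node = transfer_time(100 * MB, procs_per_node=2, nodes=1,
                         link_bw=12.5 * MB, server_bw=50 * MB)

# One process on each of two nodes (two switch ports).
two_nodes = transfer_time(100 * MB, procs_per_node=1, nodes=2,
                          link_bw=12.5 * MB, server_bw=50 * MB)

print(f"one node:  {one_node:.1f} s")
print(f"two nodes: {two_nodes:.1f} s")
```

With these assumed figures the client link is the limiting factor in both
cases, so moving the second process to another node halves the estimated
time. If the server were the slower component instead, the two layouts would
come out the same in this model.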
Without detailed knowledge of your setup, this is all hand-waving and
educated guesses, but it's probably somewhere close to reality. :-)
LAM does nothing with file IO -- so your READ and WRITE statements do not
get intercepted or processed by LAM at all. I suspect that if you run the
same experiment without LAM or MPI at all, you'll get the same results.
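One way to try that experiment is a small timing script with no MPI in it at
all. The following is a minimal sketch, in Python rather than Fortran for
brevity; the function name time_file_io, the IO_TEST_DIR environment
variable, and the 8 MiB file size are all hypothetical choices. Set
IO_TEST_DIR to a directory on your NFS mount; otherwise it falls back to the
local temp directory. Note that the read-back may be served from the client's
cache, so the write timing is likely the more telling number.

```python
import os
import tempfile
import time


def time_file_io(path, size_bytes, chunk=1 << 20):
    """Write size_bytes to path, then read it back; return the timings.

    Returns (write_seconds, read_seconds, bytes_read).
    """
    data = b"x" * chunk

    start = time.perf_counter()
    with open(path, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(data[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out before we stop the clock
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    bytes_read = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            bytes_read += len(block)
    read_s = time.perf_counter() - start

    return write_s, read_s, bytes_read


# IO_TEST_DIR is an assumed variable name; point it at the NFS directory.
target_dir = os.environ.get("IO_TEST_DIR", tempfile.gettempdir())
path = os.path.join(target_dir, f"io_test_{os.getpid()}.dat")

write_s, read_s, bytes_read = time_file_io(path, 8 * (1 << 20))
os.remove(path)

print(f"wrote and read {bytes_read} bytes: "
      f"write {write_s:.3f} s, read {read_s:.3f} s")
```

Running two copies at once on one machine, and then one copy on each of two
machines, would mirror your two tests. The process ID in the file name keeps
simultaneous copies on the same machine from overwriting each other's files.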
Hope that helps.
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/