On Wed, 10 Sep 2003, Lei_at_UCI wrote:
> When multiple Linux/Unix processes open the same File (on NFS) for read,
> it should work. Is the Input already working in parallel?
It depends on what you mean by "in parallel". Let's consider a small
sample program:
-----
#include <stdio.h>
#include <mpi.h>
int main(int argc, char* argv[]) {
FILE *fp;
int data;
MPI_Init(&argc, &argv);
/* Assume that this file is located in an NFS tree */
fp = fopen("/nfs/some/file/name.txt", "r");
fscanf(fp, "%d", &data);
fclose(fp);
MPI_Finalize();
return 0;
}
-----
Assume a perfect system where this program runs in lock step on all
available nodes/CPUs.
Also keep in mind here that it's not just NFS that is the issue -- it's
any networked filesystem. I'll just call it "NFS" here for convenience.
In this example, all processes will attempt to open name.txt at the same
time. This will trigger the NFS code in the kernel to "reach out and
touch" the NFS server and request the file name.txt. Depending on how
many processes are running in your parallel application, this could turn
out to be a real bottleneck because the server must service all the
requests in some serial ordering (or perhaps slightly in parallel if
there are multiple CPUs available on the NFS server; but in general, you
can probably assume that the number of MPI processes requesting data is
going to be larger than the number of CPUs available on the NFS server).
So although the MPI processes are running in parallel, you have a serial
bottleneck in the NFS server. This can be helped with NFS caching and
file staging and whatnot, but that needs to be done carefully -- usually
before the MPI job even starts.
The big factors here are how many MPI processes are involved and how much
data they each need to read. For example, if you've only got 8 MPI
processes and each of them reads a few kilobytes, the NFS bottleneck is
likely to be no big deal. But if each process needs to read 100MB of
data, and/or you've got 256 processes, the NFS bottleneck becomes an
issue.
There are a variety of solutions that people typically use for this kind
of problem which usually involve some form of file staging (i.e., copying
the data down to a local directory at the beginning of the job) or a
parallel filesystem (there are several available -- google for them).
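One common variant of staging is a "read once, distribute" pattern: have
a single process do the read against NFS and then hand the data to the
rest over the interconnect. Here's a minimal sketch of that idea (the
filename is just a placeholder, not from the original program):
-----
#include <stdio.h>
#include <mpi.h>
int main(int argc, char* argv[]) {
  FILE *fp;
  int rank, data = 0;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  /* Only rank 0 touches the NFS server */
  if (rank == 0) {
    fp = fopen("/nfs/some/file/name.txt", "r");
    if (fp != NULL) {
      fscanf(fp, "%d", &data);
      fclose(fp);
    }
  }
  /* Everyone else gets the value over the network from rank 0 */
  MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);
  MPI_Finalize();
  return 0;
}
-----
This keeps all but one process away from the NFS server, at the cost of
a single broadcast (which MPI implementations are generally quite good
at).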
> Will MPI-IO make this work faster?
LAM/MPI includes the ROMIO package from Argonne National Labs, which
provides an efficient implementation of the MPI-2 IO API. Check through
the ROMIO documentation for more details on its implementation and usage.
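For reference, a read through the MPI-2 IO interface looks roughly like
the sketch below. Two caveats: the path is just a placeholder, and since
MPI-IO reads raw bytes, this assumes the file holds a binary int rather
than the text that fscanf() would parse:
-----
#include <mpi.h>
int main(int argc, char* argv[]) {
  MPI_File fh;
  int data;
  MPI_Init(&argc, &argv);
  /* All processes open the file collectively */
  MPI_File_open(MPI_COMM_WORLD, "/nfs/some/file/name.txt",
                MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
  /* Collective read at offset 0; ROMIO can coordinate the requests */
  MPI_File_read_at_all(fh, 0, &data, 1, MPI_INT, MPI_STATUS_IGNORE);
  MPI_File_close(&fh);
  MPI_Finalize();
  return 0;
}
-----
Whether this is actually faster than fopen()/fscanf() depends on the
filesystem and the access pattern; collective IO tends to pay off when
many processes read large, regular chunks of the same file.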
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/