Hi all,
I have a problem using FFTW with LAM/MPI. I have written a
program to test its efficiency:
for (idx = 0; idx < 5000; idx++)
{
    fftw_mpi(forward_plan, 4096, mydata, outdata);
    if (rank == 0)
    {
        // processing outdata here
    }
    fftw_mpi(backward_plan, 4096, outdata, mydata);
}
I found that the above code, running on an 8-node parallel network,
is no faster than when it runs on a single machine!
I wonder whether something is wrong with my code, or whether FFTW over
MPI is simply not as good as we expect.
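
To put a number on "no faster", here is a minimal sketch of how the comparison could be expressed as speedup and parallel efficiency. The function names speedup and parallel_efficiency are hypothetical helpers written for illustration, and the times passed in are assumed to be wall-clock seconds for the whole loop; nothing here is a measurement from my runs.

```c
#include <assert.h>

/* Speedup: serial wall time divided by parallel wall time.
 * A value of 1.0 means the parallel run is no faster than the serial run. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_parallel > 0.0);   /* guard against division by zero */
    return t_serial / t_parallel;
}

/* Parallel efficiency: speedup divided by the number of processes.
 * 1.0 would be ideal scaling; 1.0 / nprocs means no gain at all. */
double parallel_efficiency(double t_serial, double t_parallel, int nprocs)
{
    assert(nprocs > 0);
    return speedup(t_serial, t_parallel) / (double)nprocs;
}
```

If the 8-process run takes exactly as long as the single-machine run, speedup() gives 1.0 and parallel_efficiency() gives 0.125, which is roughly the situation I am seeing.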