LAM/MPI General User's Mailing List Archives

From: Kumar, Ravi Ranjan (rrkuma0_at_[hidden])
Date: 2005-03-22 14:08:13


Hello,

Thanks a lot for your replies!

Since I am a newbie to MPI, I might have misunderstood something, which is why
I am not getting the expected results. Let me give a clear picture of what I
want my MPI code to do. This might help you point out my mistakes and clarify
my misconceptions about implementing MPI.

PROBLEM DESCRIPTION (you may skip the description and go straight to the
pseudocode): I want to compute T[Nz][Nx][Ny] = T[101][101][101] points (using a
3D array, since the physical structure is a cuboid). The problem has Nz planes,
each plane containing Nx*Ny data points. To parallelize this, I divided the Nz
planes into N slices and assigned each node to work on one slice, so the number
of slices equals the number of nodes. Each slice has r planes and each plane
has Nx*Ny = 101x101 data points, so r x N = Nz. Each point has 6 neighbouring
points (north, east, west, south, top and bottom), and computing the result at
a point needs data from those 6 neighbours. Points on the end planes of a slice
need data from neighbouring processors, so interface data must be exchanged
between adjacent nodes (a rough sketch of this exchange follows below).
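
For what it is worth, here is a rough sketch of how I picture
exchange_interface_data_T(). It assumes each node stores its slice as a flat
array of (r+2)*Nx*Ny doubles with one "ghost" plane at either end; the names r,
Nx, Ny, rank and size are just placeholders for my real variables. It uses
MPI_Sendrecv so that the up/down exchanges cannot deadlock, and MPI_PROC_NULL
so that the two end slices simply skip the exchange they have no neighbour for:

----------------------------------------------------------------------
#include <mpi.h>

/* Sketch only: ghost planes at k = 0 and k = r+1, interior planes k = 1..r */
void exchange_interface_data_T(double *T, int r, int Nx, int Ny,
                               int rank, int size)
{
    int plane = Nx * Ny;                                /* doubles per plane */
    int up    = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;
    int down  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    MPI_Status status;

    /* send my top interior plane up; receive the lower neighbour's
       top plane into my bottom ghost plane                          */
    MPI_Sendrecv(&T[r * plane], plane, MPI_DOUBLE, up,   10,
                 &T[0],         plane, MPI_DOUBLE, down, 10,
                 MPI_COMM_WORLD, &status);

    /* send my bottom interior plane down; receive the upper neighbour's
       bottom plane into my top ghost plane                             */
    MPI_Sendrecv(&T[1 * plane],       plane, MPI_DOUBLE, down, 20,
                 &T[(r + 1) * plane], plane, MPI_DOUBLE, up,   20,
                 MPI_COMM_WORLD, &status);
}
----------------------------------------------------------------------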

I want to compute the same thing at different time levels. Once all the nodes
complete their task at a particular time level, everything is ready for the
next time level; each new time level requires data from the old one. Basically,
we have to solve a system of linear equations AT = F at each time level.

Below is the pseudocode for my problem:

------------------------------------------------------------------
for (time = 1; time <= Nt; time++)   // TIME LOOP BEGINS HERE
{
    calculate_F();

    do   // COMPUTE NEW T UNTIL THE CONVERGENCE CONDITION IS MET
    {
        // exchange the interfacial planes of data before the computation starts
        exchange_interface_data_T(...);

        // computes the points on odd-numbered planes
        Red_SOR(...);

        // computes the points on even-numbered planes
        Black_SOR(...);

        // finds the maximum change in the data point values compared to the old values
        GlobalMaxErr = find_Global_max_error(...);

    } while (GlobalMaxErr > tolerance_limit);

    update_some_values(...);

    update_old_T(...);

    MPI_Barrier(MPI_COMM_WORLD);   // TO MAKE SURE ALL NODES COMPLETED THEIR TASK

}   // TIME LOOP ENDS HERE

---------------------------------------------------------
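
And this is roughly what I mean by Red_SOR() (Black_SOR() is the same thing on
the even-numbered planes). The 7-point Poisson-like stencil below is purely for
illustration; the real coefficients come from my matrix A, and all the names
are placeholders:

----------------------------------------------------------------------
/* Index into a flat (r+2) x Nx x Ny slab with ghost planes at k = 0 and r+1 */
#define IDX(k, i, j) ((((k) * Nx) + (i)) * Ny + (j))

/* Sketch only: one SOR sweep restricted to the ODD-numbered planes,
   using an illustrative 7-point stencil and relaxation factor omega */
void Red_SOR(double *T, const double *F, int r, int Nx, int Ny, double omega)
{
    int i, j, k;
    double gs;

    for (k = 1; k <= r; k += 2)                  /* odd planes of this slice */
        for (i = 1; i < Nx - 1; i++)
            for (j = 1; j < Ny - 1; j++) {
                gs = (T[IDX(k-1,i,j)] + T[IDX(k+1,i,j)] +
                      T[IDX(k,i-1,j)] + T[IDX(k,i+1,j)] +
                      T[IDX(k,i,j-1)] + T[IDX(k,i,j+1)] -
                      F[IDX(k,i,j)]) / 6.0;
                T[IDX(k,i,j)] = (1.0 - omega) * T[IDX(k,i,j)] + omega * gs;
            }
}
----------------------------------------------------------------------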

find_Global_max_error(...) finds the maximum over the local maxima of all the
nodes. To achieve this, I first assume that node0 holds the global maximum
error. Then each node sends its local_max_error to node0. If the
local_max_error from a node happens to be greater than node0's assumed global
maximum, the local value becomes the global one; otherwise nothing changes.
Once all the nodes have sent their local values to node0 and the global value
is finalised, node0 broadcasts it so that every node has global_max_error. This
is what I wanted to achieve.
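
To make this concrete, here is a rough sketch of my intention for
find_Global_max_error(); the tag value and variable names are placeholders. The
point I am unsure about is that node0 must post one receive for EACH of the
other nodes, and every node (including node0) must call the broadcast:

----------------------------------------------------------------------
#include <mpi.h>

/* Sketch only: reduce the local maximum errors to node0 by hand,
   then broadcast the result so every node gets the same value    */
double find_Global_max_error(double local_max_error, int rank, int size)
{
    double global_max = local_max_error;
    int    src;

    if (rank == 0) {
        for (src = 1; src < size; src++) {       /* one receive per sender */
            double remote;
            MPI_Recv(&remote, 1, MPI_DOUBLE, src, 99,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (remote > global_max)
                global_max = remote;
        }
    } else {
        MPI_Send(&local_max_error, 1, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD);
    }

    MPI_Bcast(&global_max, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);  /* root = node0 */
    return global_max;
}
----------------------------------------------------------------------

If I understand the man pages correctly, a single call to
MPI_Allreduce(&local_max_error, &GlobalMaxErr, 1, MPI_DOUBLE, MPI_MAX,
MPI_COMM_WORLD) would do the same job in one step; please correct me if that is
wrong.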

Is there any mistake in the order of receiving the data from each node and
broadcasting? Can I not implement it the way I did? My code runs for a 51x51x51
array but hangs for a 101x101x101 array *after a few time steps*.

Kindly help me out!

Thanks!
Ravi R. Kumar

Quoting Brian Barrett <brbarret_at_[hidden]>:

> On Mar 22, 2005, at 9:04 AM, David Cronk wrote:
>
> > Neil Storer wrote:
> >> Ravi,
> >> You seem to be sending 1 MPI_DOUBLE value from the MAIN program of
> >> each of your tasks (except rank0) to rank0, but only doing a single
> >> receive in rank0. This will leave the remaining MPI_DOUBLE-size
> >> messages in the buffer. The MPI_Bcast on rank0 will get one of these
> >> buffered messages and you are now totally out of step with your
> >> SEND/RECVs.
> >
> > I am not a LAM developer and I have not looked at the LAM source code,
> > but I doubt very much that LAM does this. This would be in clear
> > violation of the standard. Something like this may lead to deadlock,
> > but a Bcast should never receive data sent with a send.
>
> David is correct on both counts. Data sent using the collective
> algorithms is completely separate from data sent by the point-to-point
> functions. So there is no chance of data sent with an MPI_Send being
> received as part of an MPI_Bcast. This would be a blatant violation of
> the MPI standard, and would break a number of applications. There may
> be places where LAM interprets the standard differently than other MPI
> implementations, but this is definitely not one of them.
>
> > Having said that, I can't see how it would ever work (with your
> > smaller problem size) and why the failure appears to be in the MAIN
> > program. Unless you haven't shown us the full MPI code in the MAIN
> > program.
>
> I do want to point out that Neil is probably right here - there is
> something with the communication patterns of Ravi's code that is
> causing messages to arrive in an order he doesn't expect. Usually
> using MPE or XMPI and writing out the communication patterns is a good
> way to work through what's happening.
>
> Hope this helps,
>
> Brian
>
> --
> Brian Barrett
> LAM/MPI developer and all around nice guy
> Have a LAM/MPI day: http://www.lam-mpi.org/
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>