LAM/MPI General User's Mailing List Archives

From: Hector (460853_at_[hidden])
Date: 2006-11-26 19:25:01


Mm... yeah, but the thing is that I had read that MPI_Bcast makes a
copy into the same region of memory, so I understood that when a broadcast was
sent, all the processes would "hear" it without needing any kind of
"receive". I didn't know that the broadcast call was also used for receiving.

Thank you very much for clarifying!!

----- Original Message -----
From: "Andrew Friedley" <afriedle_at_[hidden]>
To: "General LAM/MPI mailing list" <lam_at_[hidden]>
Sent: Sunday, November 26, 2006 7:05 PM
Subject: Re: LAM: MPI_Broadcast misunderstanding

> MPI_Bcast is a collective operation - that is, all ranks in a
> communicator (e.g. MPI_COMM_WORLD) must participate in the operation.
> This is why your first example doesn't work - only rank 0 in
> MPI_COMM_WORLD is calling MPI_Bcast, when all ranks should be calling it
> (your second example).
>
> Regarding the thought process behind your first example: how do you
> expect the non-zero ranks to receive anything when they never indicate
> to MPI that data should be received? Your aux variable is never going
> to change from its initial value, thus the infinite loop.
>
> If it helps, MPI_Bcast can be thought of in terms of send/recv: the
> root rank (0 in your examples) does NP-1 sends, one to each of the other
> ranks in the specified communicator. Each of these NP-1 other ranks
> performs a single MPI_Recv operation. The idea behind collectives is to
> make it easy for the user to express more complex (but common)
> communication patterns, and to allow the MPI implementation to optimize
> them internally.
>
> And if you really want to do the branch thing like you do for send/recv,
> you could do this (though it's duplicated code):
>
> if (rank == 0) {
>     aux = (rand()/200000000);
>     printf ("New Aux = %d\n", aux);
>     MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
> } else {
>     MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
> }
>
>
> Andrew
>
>
> 460853_at_[hidden] wrote:
>> Hello everybody.
>>
>> I'm still on my MPI learning challenge :) Now I'm trying to learn to use
>> the MPI_Bcast instruction.
>>
>> I guess I must have a deep conceptual problem here. I'm trying to write a
>> very simple program in which node 0 generates random numbers in a variable
>> called "aux" and broadcasts its value to all the nodes in the MPI world
>> until this "aux" number reaches 5. I've done this:
>> ------------------------------- broad.c ------------------------------
>> int main (int argc, char *argv[]){
>> int rank, size, node, aux=0, Counter=0, request;
>> MPI_Status status;
>>
>> MPI_Init (&argc, &argv); /* starts MPI */
>> MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id
>> */
>> MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes
>> */
>>
>> srand (time(NULL));
>>
>> while (aux != 5){
>> if (rank == 0){
>> aux = (rand()/200000000);
>> printf ("New Aux = %d\n", aux);
>> MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
>> }
>> printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank,
>> aux);
>> }
>> printf ("Aux has reached %d in node %d\n", aux, rank);
>> return MPI_Finalize ();
>> }
>> ---------------------------------------------------------------
>>
>> So I thought that when node 0 calculates a new "aux" value, it would send
>> it to all the processes in the world (ehm... actually, only nodes 0 and 1),
>> but it doesn't work. If I execute this, node 1 doesn't update its "aux"
>> value and the system goes into an infinite loop, outputting:
>>
>> New Aux = 8
>> I'm node 0, and I've got an 'aux' value of: 8
>> New Aux = 6
>> I'm node 0, and I've got an 'aux' value of: 6
>> New Aux = 5
>> I'm node 0, and I've got an 'aux' value of: 5
>> Aux has reached 5 in node 0
>> I'm node 1, and I've got an 'aux' value of: 0
>> I'm node 1, and I've got an 'aux' value of: 0
>> I'm node 1, and I've got an 'aux' value of: 0
>> I'm node 1, and I've got an 'aux' value of: 0
>> I'm node 1, and I've got an 'aux' value of: 0
>> I'm node 1, and I've got an 'aux' value of: 0
>>
>> So it seems that node 0 calculates properly, but then it's as if the
>> "broadcast" message isn't received by node 1.
>>
>> I also tried to stop it somehow with an MPI_Barrier:
>>
>> ---------------------- barrier.c --------------------------
>> void stop(){
>> //sleep (1);
>> MPI_Barrier (MPI_COMM_WORLD);
>> }
>>
>> int main (int argc, char *argv[]){
>> int rank, size, node, aux=0, Counter=0, request;
>> MPI_Status status;
>>
>> MPI_Init (&argc, &argv); /* starts MPI */
>> MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id
>> */
>> MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes
>> */
>>
>> srand (time(NULL));
>>
>> while (aux != 5){
>> if (rank == 0){
>> aux = (rand()/200000000);
>> printf ("New Aux = %d\n", aux);
>> MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
>> }
>> printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank,
>> aux);
>> stop();
>> }
>> printf ("Aux has reached %d in node %d\n", aux, rank);
>> return MPI_Finalize ();
>> }
>> ---------------------------------------------------
>>
>> But then I still have the same problem: aux in node 1 is not updated, and
>> when aux in node 0 reaches 5, node 0 stops and I get the following error:
>>
>> Aux has reached 5 in node 0
>> I'm node 1, and I've got an 'aux' value of: 0
>> MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
>> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
>> Rank (1, MPI_COMM_WORLD): - MPI_Recv()
>> Rank (1, MPI_COMM_WORLD): - MPI_Barrier()
>> Rank (1, MPI_COMM_WORLD): - main()
>>
>> The only way to solve it is to put the MPI_Bcast call outside the
>> if (rank == 0) block:
>>
>> ------------------------ bcast2.c-----------------------------
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <time.h>
>> #include <mpi.h>
>>
>> int main (int argc, char *argv[]){
>>     int rank, size, node, aux=0, Counter=0, request;
>>     MPI_Status status;
>>
>>     MPI_Init (&argc, &argv); /* starts MPI */
>>     MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
>>     MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */
>>
>>     srand (time(NULL));
>>
>>     while (aux != 5){
>>         if (rank == 0){
>>             aux = (rand()/200000000);
>>             printf ("New Aux = %d\n", aux);
>>         }
>>         /* every rank calls MPI_Bcast: rank 0 sends, the others receive */
>>         MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
>>         printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank,
>>                 aux);
>>     }
>>     printf ("Aux has reached %d in node %d\n", aux, rank);
>>     return MPI_Finalize ();
>> }
>> ------------------------------------------------------
>>
>> Then yes, it works properly:
>>
>> New Aux = 9
>> I'm node 0, and I've got an 'aux' value of: 9
>> New Aux = 9
>> I'm node 1, and I've got an 'aux' value of: 9
>> I'm node 0, and I've got an 'aux' value of: 9
>> New Aux = 0
>> I'm node 1, and I've got an 'aux' value of: 9
>> I'm node 0, and I've got an 'aux' value of: 0
>> I'm node 1, and I've got an 'aux' value of: 0
>> New Aux = 5
>> I'm node 0, and I've got an 'aux' value of: 5
>> I'm node 1, and I've got an 'aux' value of: 5
>> Aux has reached 5 in node 0
>> Aux has reached 5 in node 1
>>
>> What I would like to do with MPI_Bcast is something like what I can do
>> with a Send/Receive, which looks like this:
>>
>> ---------------------- bcastWithoutBcast.c ------------------------
>> int main (int argc, char *argv[]){
>> int rank, size, node, aux=0, Counter=0, request;
>> MPI_Status status;
>>
>> MPI_Init (&argc, &argv); /* starts MPI */
>> MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id
>> */
>> MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes
>> */
>>
>> srand (time(NULL));
>>
>> while (aux != 5){
>> if (rank == 0){
>> aux = (rand()/200000000);
>> printf ("New Aux = %d\n", aux);
>> MPI_Send (&aux, 1, MPI_INT, 1, 100, MPI_COMM_WORLD);
>> }else {
>> MPI_Recv (&aux, 1, MPI_INT, 0, 100, MPI_COMM_WORLD, &status);
>> }
>> printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank,
>> aux);
>> }
>> printf ("Aux has reached %d in node %d\n", aux, rank);
>> return MPI_Finalize ();
>> }
>> -----------------------------------------------------
>>
>> But if I try it with MPI_Bcast inside the if (rank == 0) block, it doesn't
>> work, and I don't understand why. It's as if node 1 did not listen to the
>> broadcast instruction...
>>
>> Thank you very much for any help you can give me.
>>
>>
>>
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/