MPI_Bcast is a collective operation - that is, all ranks in a
communicator (e.g. MPI_COMM_WORLD) must participate in the operation.
This is why your first example doesn't work - only rank 0 in
MPI_COMM_WORLD is calling MPI_Bcast, when all ranks should be calling it
(as they do in your bcast2.c example).
Regarding your thought process in your first example - how do you
expect the non-zero ranks to receive anything when they never indicate
to MPI that data should be received? Their aux variable is never going
to change from its initial value, hence the infinite loop.
If it helps, MPI_Bcast can be thought of in terms of send/recv: the
root rank (0 in your examples) does NP-1 sends, one to each of the other
ranks in the specified communicator. Each of these NP-1 other ranks
performs a single MPI_Recv operation. The idea behind collectives is to
make it easy for the user to express more complex (but common)
communication patterns, and to allow the MPI implementation to optimize
them internally.
And if you really want to do the branch thing like you do for send/recv,
you could do this (though it's duplicated code):
if (rank == 0) {
    aux = (rand()/200000000);
    printf ("New Aux = %d\n", aux);
    MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
} else {
    MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
}
Andrew
460853_at_[hidden] wrote:
> Hello everybody.
>
> I'm still on my MPI learning challenge :) Now I'm trying to learn to use the
> MPI_Broadcast instruction.
>
> I guess I must have a deep concept problem here. I'm trying to do a very simple
> program in which node 0 calculates random numbers in a variable called "aux"
> and broadcasts its value to all the nodes in the MPI World until this "aux"
> number reaches 5. I've done this:
> ------------------------------- broad.c ------------------------------
> int main (int argc, char *argv[]){
> int rank, size, node, aux=0, Counter=0, request;
> MPI_Status status;
>
> MPI_Init (&argc, &argv); /* starts MPI */
> MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
> MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */
>
> srand (time(NULL));
>
> while (aux != 5){
> if (rank == 0){
> aux = (rand()/200000000);
> printf ("New Aux = %d\n", aux);
> MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
> }
> printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
> }
> printf ("Aux has reached %d in node %d\n", aux, rank);
> return MPI_Finalize ();
> }
> ---------------------------------------------------------------
>
> So I thought that when node 0 calculates a new "aux" value would send it to all
> the processes in the World (ehm... actually, only nodes 0 and 1) but it doesn't
> work. If I execute this, node 1 doesn't update its "aux" value and the system
> goes into an infinite loop, outputting:
>
> New Aux = 8
> I'm node 0, and I've got an 'aux' value of: 8
> New Aux = 6
> I'm node 0, and I've got an 'aux' value of: 6
> New Aux = 5
> I'm node 0, and I've got an 'aux' value of: 5
> Aux has reached 5 in node 0
> I'm node 1, and I've got an 'aux' value of: 0
> I'm node 1, and I've got an 'aux' value of: 0
> I'm node 1, and I've got an 'aux' value of: 0
> I'm node 1, and I've got an 'aux' value of: 0
> I'm node 1, and I've got an 'aux' value of: 0
> I'm node 1, and I've got an 'aux' value of: 0
>
> So it seems that node 0 calculates properly but then it's like if the
> "Broadcast" message isn't received in the node 1.
>
> I also tried to stop it somehow with an MPI_Barrier:
>
> ---------------------- barrier.c --------------------------
> void stop(){
> //sleep (1);
> MPI_Barrier (MPI_COMM_WORLD);
> }
>
> int main (int argc, char *argv[]){
> int rank, size, node, aux=0, Counter=0, request;
> MPI_Status status;
>
> MPI_Init (&argc, &argv); /* starts MPI */
> MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
> MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */
>
> srand (time(NULL));
>
> while (aux != 5){
> if (rank == 0){
> aux = (rand()/200000000);
> printf ("New Aux = %d\n", aux);
> MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
> }
> printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
> stop();
> }
> printf ("Aux has reached %d in node %d\n", aux, rank);
> return MPI_Finalize ();
> }
> ---------------------------------------------------
>
> But then I also have the same problem (aux in node 1 is not updated, and when
> aux in node 0 reaches 5, node 0 stops and then I've got the following error
>
> Aux has reached 5 in node 0
> I'm node 1, and I've got an 'aux' value of: 0
> MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
> Rank (1, MPI_COMM_WORLD): - MPI_Recv()
> Rank (1, MPI_COMM_WORLD): - MPI_Barrier()
> Rank (1, MPI_COMM_WORLD): - main()
>
> The only way to solve it is putting the MPI_Broadcast outside the if (rank==0):
>
> ------------------------ bcast2.c-----------------------------
> int main (int argc, char *argv[]){
> int rank, size, node, aux=0, Counter=0, request;
> MPI_Status status;
>
> MPI_Init (&argc, &argv); /* starts MPI */
> MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
> MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */
>
> srand (time(NULL));
>
> while (aux != 5){
> if (rank == 0){
> aux = (rand()/200000000);
> printf ("New Aux = %d\n", aux);
> }
> MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
> printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
> }
> printf ("Aux has reached %d in node %d\n", aux, rank);
> return MPI_Finalize ();
> }
> ------------------------------------------------------
>
> Then yes, it works properly:
>
> New Aux = 9
> I'm node 0, and I've got an 'aux' value of: 9
> New Aux = 9
> I'm node 1, and I've got an 'aux' value of: 9
> I'm node 0, and I've got an 'aux' value of: 9
> New Aux = 0
> I'm node 1, and I've got an 'aux' value of: 9
> I'm node 0, and I've got an 'aux' value of: 0
> I'm node 1, and I've got an 'aux' value of: 0
> New Aux = 5
> I'm node 0, and I've got an 'aux' value of: 5
> I'm node 1, and I've got an 'aux' value of: 5
> Aux has reached 5 in node 0
> Aux has reached 5 in node 1
>
> What I would like to do with MPI_Broadcast would be something like what I can do
> with a Send/Receive that looks like this:
>
> ---------------------- bcastWithoutBcast.c ------------------------
> int main (int argc, char *argv[]){
> int rank, size, node, aux=0, Counter=0, request;
> MPI_Status status;
>
> MPI_Init (&argc, &argv); /* starts MPI */
> MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
> MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */
>
> srand (time(NULL));
>
> while (aux != 5){
> if (rank == 0){
> aux = (rand()/200000000);
> printf ("New Aux = %d\n", aux);
> MPI_Send (&aux, 1, MPI_INT, 1, 100, MPI_COMM_WORLD);
> }else {
> MPI_Recv (&aux, 1, MPI_INT, 0, 100, MPI_COMM_WORLD, &status);
> }
> printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
> }
> printf ("Aux has reached %d in node %d\n", aux, rank);
> return MPI_Finalize ();
> }
> -----------------------------------------------------
>
> But if I try with MPI_Broadcast inside the if (rank==0) block, I can't, and I
> don't understand why. Is like if node 1 did not listen to the broadcast
> instruction...
>
> Thank you very much for the help you could give me
>
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/