LAM/MPI General User's Mailing List Archives

From: 460853_at_[hidden]
Date: 2006-11-23 11:44:35


Hello everybody.

I'm still working through my MPI learning challenge :) Now I'm trying to learn
how to use the MPI_Bcast (broadcast) call.

I guess I must have a basic conceptual misunderstanding here. I'm trying to write
a very simple program in which node 0 generates random numbers in a variable
called "aux" and broadcasts each value to all the nodes in MPI_COMM_WORLD until
"aux" reaches 5. I've done this:
------------------------------- broad.c ------------------------------
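#include <stdio.h>   /* printf */
#include <stdlib.h>  /* rand, srand */
#include <time.h>    /* time */
#include <mpi.h>

int main (int argc, char *argv[]){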
        int rank, size, node, aux=0, Counter=0, request;
        MPI_Status status;

        MPI_Init (&argc, &argv); /* starts MPI */
        MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
        MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */

        srand (time(NULL));

        while (aux != 5){
                if (rank == 0){
                   aux = (rand()/200000000);
                   printf ("New Aux = %d\n", aux);
                   MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
                }
                printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
        }
        printf ("Aux has reached %d in node %d\n", aux, rank);
        return MPI_Finalize ();
}
---------------------------------------------------------------

So I thought that when node 0 calculates a new "aux" value, it would send it to
all the processes in the world (well, actually only nodes 0 and 1), but it
doesn't work. When I run this, node 1 never updates its "aux" value and the
program goes into an infinite loop, outputting:

New Aux = 8
I'm node 0, and I've got an 'aux' value of: 8
New Aux = 6
I'm node 0, and I've got an 'aux' value of: 6
New Aux = 5
I'm node 0, and I've got an 'aux' value of: 5
Aux has reached 5 in node 0
I'm node 1, and I've got an 'aux' value of: 0
I'm node 1, and I've got an 'aux' value of: 0
I'm node 1, and I've got an 'aux' value of: 0
I'm node 1, and I've got an 'aux' value of: 0
I'm node 1, and I've got an 'aux' value of: 0
I'm node 1, and I've got an 'aux' value of: 0

So it seems that node 0 calculates properly, but it's as if the broadcast
message is never received by node 1.

I also tried to make the nodes wait for each other with an MPI_Barrier:

---------------------- barrier.c --------------------------
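#include <stdio.h>   /* printf */
#include <stdlib.h>  /* rand, srand */
#include <time.h>    /* time */
#include <mpi.h>

void stop(){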
        //sleep (1);
        MPI_Barrier (MPI_COMM_WORLD);
}

int main (int argc, char *argv[]){
        int rank, size, node, aux=0, Counter=0, request;
        MPI_Status status;

        MPI_Init (&argc, &argv); /* starts MPI */
        MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
        MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */

        srand (time(NULL));

        while (aux != 5){
                if (rank == 0){
                        aux = (rand()/200000000);
                        printf ("New Aux = %d\n", aux);
                        MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
                }
                printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
                stop();
        }
        printf ("Aux has reached %d in node %d\n", aux, rank);
        return MPI_Finalize ();
}
---------------------------------------------------

But then I have the same problem: aux in node 1 is not updated, and when
aux in node 0 reaches 5, node 0 stops and I get the following error:

Aux has reached 5 in node 0
I'm node 1, and I've got an 'aux' value of: 0
MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Recv()
Rank (1, MPI_COMM_WORLD): - MPI_Barrier()
Rank (1, MPI_COMM_WORLD): - main()

The only way I've found to solve it is to put the MPI_Bcast outside the if (rank==0) block:

------------------------ bcast2.c-----------------------------
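#include <stdio.h>   /* printf */
#include <stdlib.h>  /* rand, srand */
#include <time.h>    /* time */
#include <mpi.h>

int main (int argc, char *argv[]){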
        int rank, size, node, aux=0, Counter=0, request;
        MPI_Status status;

        MPI_Init (&argc, &argv); /* starts MPI */
        MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
        MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */

        srand (time(NULL));

        while (aux != 5){
                if (rank == 0){
                        aux = (rand()/200000000);
                        printf ("New Aux = %d\n", aux);
                }
                MPI_Bcast (&aux, 1, MPI_INT, 0, MPI_COMM_WORLD);
                printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
        }
        printf ("Aux has reached %d in node %d\n", aux, rank);
        return MPI_Finalize ();
}
------------------------------------------------------

Then yes, it works properly:

New Aux = 9
I'm node 0, and I've got an 'aux' value of: 9
New Aux = 9
I'm node 1, and I've got an 'aux' value of: 9
I'm node 0, and I've got an 'aux' value of: 9
New Aux = 0
I'm node 1, and I've got an 'aux' value of: 9
I'm node 0, and I've got an 'aux' value of: 0
I'm node 1, and I've got an 'aux' value of: 0
New Aux = 5
I'm node 0, and I've got an 'aux' value of: 5
I'm node 1, and I've got an 'aux' value of: 5
Aux has reached 5 in node 0
Aux has reached 5 in node 1

What I would like to do with MPI_Bcast is something like what I can do
with a Send/Receive pair, which looks like this:

---------------------- bcastWithoutBcast.c ------------------------
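#include <stdio.h>   /* printf */
#include <stdlib.h>  /* rand, srand */
#include <time.h>    /* time */
#include <mpi.h>

int main (int argc, char *argv[]){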
        int rank, size, node, aux=0, Counter=0, request;
        MPI_Status status;

        MPI_Init (&argc, &argv); /* starts MPI */
        MPI_Comm_rank (MPI_COMM_WORLD, &rank); /* get current process id */
        MPI_Comm_size (MPI_COMM_WORLD, &size); /* get number of processes */

        srand (time(NULL));

        while (aux != 5){
                if (rank == 0){
                        aux = (rand()/200000000);
                        printf ("New Aux = %d\n", aux);
                        MPI_Send (&aux, 1, MPI_INT, 1, 100, MPI_COMM_WORLD);
                }else {
                        MPI_Recv (&aux, 1, MPI_INT, 0, 100, MPI_COMM_WORLD, &status);
                }
                printf ("I'm node %d, and I've got an \'aux\' value of: %d\n", rank, aux);
        }
        printf ("Aux has reached %d in node %d\n", aux, rank);
        return MPI_Finalize ();
}
-----------------------------------------------------

But if I put MPI_Bcast inside the if (rank==0) block, it doesn't work, and I
don't understand why. It is as if node 1 were not listening for the
broadcast...

Thank you very much for any help you can give me.