LAM/MPI General User's Mailing List Archives

From: bcruchet_at_[hidden]
Date: 2004-08-02 16:01:13


Hi ...

I have a problem with a program written with LAM. The program copies a
file from node 0 to three other nodes.

I don't have a problem when I run the program like this:

mpirun -np 4 mpi_copy /etc/hosts /home/mpi/hosts

This copies the local file /etc/hosts to /home/mpi/hosts on the nodes. I
have tried it with large files without any problem, but when I run the
program like this:

mpirun -np 4 mpi_copy /bin/ls /home/mpi/ls

the program crashes with this error:

MPI_Recv: process in local group is dead (rank 2, MPI_COMM_WORLD)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD): - MPI_Recv()
MPI_Recv: process in local group is dead (rank 3, MPI_COMM_WORLD)
Rank (3, MPI_COMM_WORLD): Call stack within LAM:
Rank (3, MPI_COMM_WORLD): - MPI_Recv()
Rank (2, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 28682 failed on node n0 (192.168.1.200) due to signal 13.
-----------------------------------------------------------------------------
Rank (3, MPI_COMM_WORLD): - main()

Also, the file read fails at random points.

The read function on the master node is this:

---------------------------------------------------------------------------
void read_file(FILE *src_file, int cluster_size, int block_size)
{
    unsigned int current_nodo, i, temp;
    unsigned char c;

    current_nodo=1;
    i=0;
    temp=0;

    while (fread(&c, 1, 1, src_file)>0)
    {
        // send the data
        MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);

        if (i>=block_size )
        {
            current_nodo++;
            i=0;
        }
        else
        {
            i++;
        }
    }
    current_nodo=1;
    while (current_nodo<cluster_size)
    {
        printf("Sending EOF to node: %i\n",current_nodo);
        c=EOF;
        MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);
        current_nodo++;
    }
}
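
A possible explanation, offered as a sketch under assumptions and not as a confirmed diagnosis. The receiving code is not shown above, so this assumes that each worker stops when it receives a byte equal to (unsigned char)EOF. With EOF defined as -1 (as in glibc) that byte is 0xFF, which a text file such as /etc/hosts rarely contains but a binary such as /bin/ls probably does. A worker that sees it would exit early, and the master's next send to that worker could then fail with signal 13 (SIGPIPE on Linux). Separately, current_nodo is never wrapped inside the first loop, so it can reach cluster_size, which is not a valid rank. The helpers below are hypothetical names that are not part of the original program, and dest_rank uses blocks of exactly block_size bytes, whereas the loop above sends block_size+1 bytes before moving on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: rank that should receive the byte at file offset
 * `offset`. Blocks of `block_size` bytes are dealt to ranks
 * 1..cluster_size-1 in turn, wrapping back to rank 1, so the result is
 * always a valid worker rank (rank 0 is the master). */
static int dest_rank(size_t offset, size_t block_size, int cluster_size)
{
    assert(block_size > 0 && cluster_size >= 2);
    size_t workers = (size_t)cluster_size - 1;
    return 1 + (int)((offset / block_size) % workers);
}

/* Hypothetical helper: offset of the first data byte that a receiver
 * comparing against (unsigned char)EOF would mistake for the end marker,
 * or -1 if the buffer contains no such byte. */
static long first_false_eof(const unsigned char *buf, size_t len)
{
    /* Assumption: EOF is -1, so the marker byte is 0xFF. */
    const unsigned char marker = (unsigned char)EOF;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == marker)
            return (long)i;
    }
    return -1;
}
```

If the assumption about the receiver holds, one way to avoid the collision is to signal the end of the file out of band, for example by sending the final message with a different tag (say 2 instead of 1) and having the workers call MPI_Recv with MPI_ANY_TAG and check status.MPI_TAG, instead of reserving a data byte as the marker.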

Greetings from Chile

Boris
-------