Hi,
I have a problem with a program written with LAM. The program copies a file
from node 0 to the 3 other nodes.
I have no problem when I run the program like this:
mpirun -np 4 mpi_copy /etc/hosts /home/mpi/hosts
This copies the local file /etc/hosts to /home/mpi/hosts on the nodes. I have
tried it with large files without any problem, but when I run the program
like this:
mpirun -np 4 mpi_copy /bin/ls /home/mpi/ls
the program crashes with this error:
MPI_Recv: process in local group is dead (rank 2, MPI_COMM_WORLD)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD): - MPI_Recv()
MPI_Recv: process in local group is dead (rank 3, MPI_COMM_WORLD)
Rank (3, MPI_COMM_WORLD): Call stack within LAM:
Rank (3, MPI_COMM_WORLD): - MPI_Recv()
Rank (2, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
PID 28682 failed on node n0 (192.168.1.200) due to signal 13.
-----------------------------------------------------------------------------
Rank (3, MPI_COMM_WORLD): - main()
Also, reading the file fails at random points.
The read function on the master node is this:
---------------------------------------------------------------------------
void read_file(FILE *src_file, int cluster_size, int block_size)
{
    unsigned int current_nodo, i, temp;
    unsigned char c;

    current_nodo=1;
    i=0;
    temp=0;
    while (fread(&c, 1, 1, src_file)>0)
    {
        // send the data
        MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);
        if (i>=block_size )
        {
            current_nodo++;
            i=0;
        }
        else
        {
            i++;
        }
    }
    current_nodo=1;
    while (current_nodo<cluster_size)
    {
        printf("Sending EOF to node: %i\n",current_nodo);
        c=EOF;
        MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);
        current_nodo++;
    }
}
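
One thing that may be worth checking, sketched here without MPI: read_file() stores EOF in an unsigned char, which gives the byte 0xFF on platforms where EOF is -1. The receiving code is not shown, so this assumes the other nodes stop when they receive that byte value. The helper bytes_before_marker() is hypothetical and only imitates such a receiver. A text file like /etc/hosts normally has no 0xFF byte, while a binary like /bin/ls may well contain some, so a receiver written this way would exit early. Signal 13 is SIGPIPE on Linux, which would be consistent with rank 0 sending to a process that has already exited.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original program): imitates a
 * receiver that treats the byte value of (unsigned char)EOF as the
 * end-of-file marker, which is the value read_file() sends.
 * Returns how many data bytes such a receiver would accept before it
 * believes the file has ended. */
size_t bytes_before_marker(const unsigned char *data, size_t len)
{
    /* 0xFF on platforms where EOF == -1 and bytes are 8 bits wide */
    const unsigned char marker = (unsigned char)EOF;
    size_t i;

    for (i = 0; i < len; i++) {
        if (data[i] == marker)
            return i;   /* an ordinary data byte is mistaken for the marker */
    }
    return len;         /* no collision: every byte is accepted */
}
```

If this turns out to be the cause, one option, assuming the receive side can be changed too, would be to signal end-of-file with a different message tag rather than with a data value, so that every byte value stays valid as file content.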
Greetings from Chile,
Boris