LAM/MPI General User's Mailing List Archives

From: Ganesh Iyer (ftpvnit_at_[hidden])
Date: 2004-08-03 00:41:00


Hi,
I had the same problem with LAM: the signal error happened only on one
specific system with the following configuration:
Intel 3.04 GHz with HT, SCSI hard disk.
Does your system at .200 have the same kind of configuration?
Anyway, I was not able to solve my problem.

ganesh.
India.

--- bcruchet_at_[hidden] wrote:

> Hi,
>
> I have a problem with a program built with LAM. The program copies a file
> from node 0 to the other 3 nodes.
>
> I don't have a problem when I run the program like this:
>
> mpirun -np 4 mpi_copy /etc/hosts /home/mpi/hosts
>
> This copies the local file /etc/hosts to /home/mpi/hosts on the nodes. I
> tried with large files and had no problem, but when I run the program
> like this:
>
> mpirun -np 4 mpi_copy /bin/ls /home/mpi/ls
>
> the program crashes with this error:
>
> MPI_Recv: process in local group is dead (rank 2, MPI_COMM_WORLD)
> Rank (2, MPI_COMM_WORLD): Call stack within LAM:
> Rank (2, MPI_COMM_WORLD): - MPI_Recv()
> MPI_Recv: process in local group is dead (rank 3, MPI_COMM_WORLD)
> Rank (3, MPI_COMM_WORLD): Call stack within LAM:
> Rank (3, MPI_COMM_WORLD): - MPI_Recv()
> Rank (2, MPI_COMM_WORLD): - main()
> -----------------------------------------------------------------------------
> One of the processes started by mpirun has exited with a nonzero exit
> code. This typically indicates that the process finished in error.
> If your process did not finish in error, be sure to include a "return
> 0" or "exit(0)" in your C code before exiting the application.
>
> PID 28682 failed on node n0 (192.168.1.200) due to signal 13.
> -----------------------------------------------------------------------------
> Rank (3, MPI_COMM_WORLD): - main()
>
>
> Also, the file read fails at random points.
>
> The read function on the master node is this:
>
>
> ---------------------------------------------------------------------------
> void read_file(FILE *src_file, int cluster_size, int block_size)
> {
>     unsigned int current_nodo, i, temp;
>     unsigned char c;
>
>     current_nodo=1;
>     i=0;
>     temp=0;
>
>     while (fread(&c, 1, 1, src_file)>0)
>     {
>         // send the data
>         MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);
>
>         if (i>=block_size )
>         {
>             current_nodo++;
>             i=0;
>         }
>         else
>         {
>             i++;
>         }
>     }
>     current_nodo=1;
>     while (current_nodo<cluster_size)
>     {
>         printf("Sending EOF to node: %i\n",current_nodo);
>         c=EOF;
>         MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);
>         current_nodo++;
>     }
> }
>
>
>
> Greetings from Chile
>
> Boris
> -------
> _______________________________________________
> This list is archived at
> http://www.lam-mpi.org/MailArchives/lam/
>
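
One possible explanation for why /etc/hosts copies cleanly while /bin/ls
does not, offered as a guess because the receiving code is not shown: the
quoted read_file() stores EOF in an unsigned char and sends it as an
ordinary data byte. Where EOF is -1 that byte is 0xFF, which a text file
rarely contains but a binary file often does. If the receivers stop at the
first such byte, they would exit early on /bin/ls, and the sender writing
to a closed connection could then die with signal 13 (SIGPIPE on Linux).
The sketch below has no MPI in it; struct msg, TAG_DATA, TAG_DONE and both
functions are hypothetical names that only model the two ways of marking
the end of the stream.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical framing for illustration: each message carries a tag next
   to its payload byte, similar in spirit to an MPI message tag. */
enum { TAG_DATA = 1, TAG_DONE = 2 };

struct msg {
    int tag;
    unsigned char byte;
};

/* In-band marker: stop at the first payload byte equal to
   (unsigned char)EOF. This is an ASSUMED receiver behaviour; the
   receiver from the original post is not shown. */
size_t recv_until_eof_byte(const struct msg *msgs, size_t n)
{
    const unsigned char marker = (unsigned char)EOF; /* 0xFF where EOF == -1 */
    size_t count = 0;

    while (count < n && msgs[count].byte != marker)
        count++;
    return count; /* bytes accepted as file data */
}

/* Out-of-band marker: stop only when the tag says the stream is done,
   so every possible byte value can be file data. */
size_t recv_until_done_tag(const struct msg *msgs, size_t n)
{
    size_t count = 0;

    while (count < n && msgs[count].tag != TAG_DONE)
        count++;
    return count; /* bytes accepted as file data */
}
```

In MPI terms, the same idea would be to send the end-of-stream message
with a different tag than the data messages, receive with MPI_ANY_TAG, and
check the MPI_TAG field of the returned MPI_Status. Separately, the quoted
loop increments current_nodo without comparing it to cluster_size, so a
sufficiently long file would address a rank that does not exist.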

                