
LAM/MPI General User's Mailing List Archives


From: bcruchet_at_[hidden]
Date: 2004-08-03 03:06:31


My hardware configuration:

Master node:
Pentium 4, 2.0 GHz
512 MB RAM
40 GB IDE hard disk (I believe it's a Seagate)
Intel mainboard

Boris
Chile
> Hi,
> I had the same problem with LAM. The signal error happened on
> one specific system with the following configuration:
> Intel 3.04 GHz with HT, SCSI hard disk.
> Does your system on .200 have the same kind of
> configuration?
> Anyway, I was not able to solve my problem.
>
> ganesh.
> India.
>
> --- bcruchet_at_[hidden] wrote:
>
>> Hi ...
>>
>> I have a problem with a program written with LAM. The
>> program copies a file
>> from node 0 to the other 3 nodes.
>>
>> I have no problem when I run the program like this:
>>
>> mpirun -np 4 mpi_copy /etc/hosts /home/mpi/hosts
>>
>> This copies the local file /etc/hosts to
>> /home/mpi/hosts on the nodes. I tried
>> with large files and had no problem, but when
>> I run the program
>> like this:
>>
>>
>> mpirun -np 4 mpi_copy /bin/ls /home/mpi/ls
>>
>> the program crashes with this error:
>>
>> MPI_Recv: process in local group is dead (rank 2, MPI_COMM_WORLD)
>> Rank (2, MPI_COMM_WORLD): Call stack within LAM:
>> Rank (2, MPI_COMM_WORLD): - MPI_Recv()
>> MPI_Recv: process in local group is dead (rank 3, MPI_COMM_WORLD)
>> Rank (3, MPI_COMM_WORLD): Call stack within LAM:
>> Rank (3, MPI_COMM_WORLD): - MPI_Recv()
>> Rank (2, MPI_COMM_WORLD): - main()
>> -----------------------------------------------------------------------------
>> One of the processes started by mpirun has exited with a nonzero exit
>> code. This typically indicates that the process finished in error.
>> If your process did not finish in error, be sure to include a "return
>> 0" or "exit(0)" in your C code before exiting the application.
>>
>> PID 28682 failed on node n0 (192.168.1.200) due to signal 13.
>> -----------------------------------------------------------------------------
>> Rank (3, MPI_COMM_WORLD): - main()
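
On Linux, signal 13 is SIGPIPE: it is delivered to a process that writes to a pipe or socket whose reading end has already been closed. The sketch below is independent of LAM and only illustrates that condition with an ordinary pipe. The helper name write_to_closed_pipe is hypothetical and not part of the posted program. Whether this is what happens inside mpi_copy is an assumption, since the receiving side is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): writes one byte into
 * a pipe whose read end is already closed. Returns the errno reported by
 * write(), 0 if the write unexpectedly succeeded, or -1 if the pipe could
 * not be created. */
int write_to_closed_pipe(void)
{
    int fds[2];
    char byte = 'x';
    int err = 0;
    void (*old_handler)(int);

    if (pipe(fds) != 0)
        return -1;

    /* Ignore SIGPIPE so that write() reports EPIPE instead of the
     * process being terminated by the signal. */
    old_handler = signal(SIGPIPE, SIG_IGN);
    assert(old_handler != SIG_ERR);

    close(fds[0]);                     /* the "receiver" goes away */
    if (write(fds[1], &byte, 1) < 0)   /* the "sender" keeps writing */
        err = errno;
    close(fds[1]);

    signal(SIGPIPE, old_handler);      /* restore the previous disposition */
    return err;
}
```

Without the SIG_IGN line, the default action for SIGPIPE terminates the writer, which would match a "failed ... due to signal 13" report for the sending process.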
>>
>>
>> and ... reading the file fails at random.
>>
>>
>> The read function on the master node is this:
>>
>>
>> ---------------------------------------------------------------------------
>> void read_file(FILE *src_file, int cluster_size, int block_size)
>> {
>>     unsigned int current_nodo, i, temp;
>>     unsigned char c;
>>
>>     current_nodo=1;
>>     i=0;
>>     temp=0;
>>
>>     while (fread(&c, 1, 1, src_file)>0)
>>     {
>>         // send the data
>>         MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);
>>
>>         if (i>=block_size )
>>         {
>>             current_nodo++;
>>             i=0;
>>         }
>>         else
>>         {
>>             i++;
>>         }
>>     }
>>     current_nodo=1;
>>     while (current_nodo<cluster_size)
>>     {
>>         printf("Sending EOF to node: %i\n",current_nodo);
>>         c=EOF;
>>         MPI_Send(&c, 1, MPI_CHAR, current_nodo, 1, MPI_COMM_WORLD);
>>         current_nodo++;
>>     }
>> }
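
A possible explanation, offered as an assumption because the receiving code is not shown in this thread: c=EOF stores EOF in an unsigned char, which gives the byte 0xFF on platforms where EOF is -1. Text files such as /etc/hosts normally never contain that byte, but binary files such as /bin/ls can. If a receiver treats 0xFF as the end marker, it would stop early while the master is still sending. Separately, current_nodo is incremented but never wrapped back to 1, so a long enough file could address a rank that does not exist. The sketch below illustrates both points without MPI. The helper names bytes_before_sentinel and dest_rank are hypothetical, and dest_rank uses plain block_size-byte chunks rather than reproducing the posted counter exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: number of bytes a receiver would accept if it
 * treats the byte (unsigned char)EOF as the end-of-file marker. */
size_t bytes_before_sentinel(const unsigned char *data, size_t len)
{
    /* 0xFF on platforms where EOF is -1 */
    const unsigned char sentinel = (unsigned char)EOF;
    size_t n = 0;

    while (n < len && data[n] != sentinel)
        n++;
    return n;
}

/* Hypothetical helper: destination rank for the byte at `offset`,
 * cycling over the worker ranks 1..cluster_size-1 in chunks of
 * block_size bytes, so the rank never reaches cluster_size. */
int dest_rank(size_t offset, size_t block_size, int cluster_size)
{
    assert(block_size > 0 && cluster_size > 1);
    return 1 + (int)((offset / block_size) % (size_t)(cluster_size - 1));
}
```

One way to avoid an in-band marker would be to signal end-of-file out of band, for example by sending the byte count first or by using a different message tag for the final message and checking status.MPI_TAG on the receiver.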
>>
>>
>>
>> Greetings from Chile
>>
>> Boris
>> -------
>> _______________________________________________
>> This list is archived at
>> http://www.lam-mpi.org/MailArchives/lam/
>>