LAM/MPI General User's Mailing List Archives

From: David Sanchez Rodriguez (dsanchez_at_[hidden])
Date: 2006-03-09 05:58:46


Hi all,

Until now I have been working with the LAM-6.5.6 distribution, and my
applications run correctly on it.
Last week I upgraded to LAM-7.1.1 and began to have problems with signals.

Here is a simple program that shows the problem:

-----------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <mpi.h>
/* The LAM headers declaring kinit, kdoom, lam_ksignal, PRCMD and
   LAM_SIGA are not shown. */

/* Handler installed for LAM_SIGA in the father process. */
void unlockFixedSlave()
{
     printf("Recieved signal\n");
}

int main(int argc, char *argv[])
{
     int pid, NA, my_id;

     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
     MPI_Comm_size(MPI_COMM_WORLD, &NA);
     NA--;

     pid = fork();
     switch (pid) {
     case 0: // SON
     {
          if (kinit(PRCMD) != 0) {
               printf("Couldn't attach to the lamd. Bye!\n");
               exit(1);
          }
          printf("I am attached to the lamd\n");
          printf("SON PROCESS id= %d\n", getpid());
          // Assumes kdoom returns 0 on success and nonzero on failure,
          // with errno set to ENOTPROCESS or EINVAL.
          if (kdoom(getppid(), LAM_SIGA) != 0)
               printf("LAM_SIGA signal has NOT been sent\n");
          else
               printf("LAM_SIGA signal has been sent by son\n");
          break;
     }
     default: // FATHER
     {
          lam_ksignal(LAM_SIGA, unlockFixedSlave);
          printf("After lam_ksignal\n");
          while (1);   // spin until the signal arrives
     }
     }
     MPI_Finalize();
     return 0;
}
-----------------------------------------
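
A side note on checking a return value against several error codes: in C,
an expression of the form value == (A || B) first evaluates A || B to 0 or 1
and then compares, so it does not test whether the value equals either
constant. A minimal sketch, with hypothetical stand-in constants (ERR_A and
ERR_B are not LAM names, and their values are arbitrary):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for two error codes; not LAM's values. */
enum { ERR_A = 17, ERR_B = 22 };

/* Pitfall: (ERR_A || ERR_B) is the logical OR of two nonzero values,
   i.e. 1, so this only tests rc == 1. */
bool matches_wrong(int rc)
{
    return rc == (ERR_A || ERR_B);
}

/* Intended test: compare rc against each code separately. */
bool matches_right(int rc)
{
    return rc == ERR_A || rc == ERR_B;
}
```

With these definitions, matches_wrong(ERR_A) is false while
matches_right(ERR_A) is true.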

The father process installs a handler (unlockFixedSlave) for LAM_SIGA, and
the son process sends the LAM_SIGA signal to the father using the 'kdoom'
function.
With LAM-6.5.6 the message "Recieved signal" (printed by unlockFixedSlave)
is shown.
With LAM-7.1.1 the message is not shown, although the signal was sent.
Therefore, I think the signal is not being caught.
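
For comparison, the same father/son pattern can be sketched with plain POSIX
signals. This is only an analogue under stated assumptions: it uses SIGUSR1,
sigaction and kill instead of LAM_SIGA, lam_ksignal and kdoom, and it says
nothing about how either LAM version delivers its signals. The names on_usr1
and run_signal_demo are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t got_signal = 0;

static void on_usr1(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Father installs a SIGUSR1 handler, son sends SIGUSR1 to the father.
   Returns 1 if the handler ran, -1 on setup failure. */
int run_signal_demo(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t pid;

    got_signal = 0;
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Block SIGUSR1 so it cannot arrive before sigsuspend() waits. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0) {                      /* SON */
        kill(getppid(), SIGUSR1);
        _exit(0);
    }

    /* FATHER: sleep until the signal arrives instead of spinning. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGUSR1);
    while (!got_signal)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    waitpid(pid, NULL, 0);               /* reap the son */
    return got_signal ? 1 : 0;
}
```

Calling run_signal_demo() from a main function should return 1 once the
handler has run; the father waits in sigsuspend rather than in a busy loop.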

Could anybody tell me what the problem is and how it can be solved?

Below are the mpirun call and the results for each distribution:

-WITH LAM 6.5.6

dsanchez_at_beltram:~/kk$ hcc -I/usr/src/lam-6.5.6/share/include pruebasignal.c -o prueba
dsanchez_at_beltram:~/kk$ mpirun -v -np 1 prueba
756 prueba running on n0 (o)
After lam_ksignal
I am attached to the lamd
SON PROCESS id= 757
LAM_SIGA signal has been sent by son
Recieved signal

-WITH LAM 7.1.1

dsanchez_at_beltram:~/kk$ hcc -I/usr/src/lam-7.1.1/share/include pruebasignal.c -o prueba
dsanchez_at_beltram:~/kk$ mpirun -v -sigs -np 1 prueba
769 prueba running on n0 (o)
After lam_ksignal
I am attached to the lamd
SON PROCESS id= 770
LAM_SIGA signal has been sent by son
dsanchez_at_beltram:~/kk$

Regards,
David

-- 
******************************
David Sánchez Rodríguez
Departamento de Ingeniería Telemática
Edif. Telecomunicaciones, Pab. C, Desp. 237
Campus Universitario de Tafira s/n
35017 - Las Palmas de Gran Canaria
SPAIN
TLF: 928-458047
FAX: 928-451380
E-Mail: dsanchez_at_[hidden]