LAM/MPI General User's Mailing List Archives

From: abhik.sarkar_at_[hidden]
Date: 2003-07-17 05:05:51


     Hi All,

     I am Abhik Sarkar. I am currently working on a Beowulf cluster
     running LAM 6.5.6. For a while I have been trying to use MPI with
     threads, such that the main thread performs a decoding operation
     and, as and when it requires data from another process, it prompts
     a child (POSIX) thread to do the communication. A simple program
     implements the scenario: 2 processes on a single node each create
     a thread; one process, through its thread, calls MPI_Send, and the
     second calls MPI_Probe followed by MPI_Recv. All calls into LAM are
     made from the child threads. Apparently, MPI_Send() works fine, but
     MPI_Recv() is giving a problem.
     
     The following is the code.
     --------------------------
     
     #include <stdio.h>
     #include <pthread.h>
     #include <mpi.h>
     #include <curses.h>
     #include <string.h>
     #include <unistd.h>

     char c;
     int rank;

     /* Carries main()'s argc/argv to the thread that calls MPI_Init(). */
     struct procnum {
         int *argc1;
         char ***argv1;
     };

     /* Child thread: makes every MPI call in the process. */
     void *thrdprobe(void *t)
     {
         struct procnum *p = (struct procnum *)t;
         int len;
         char *s;
         MPI_Status stat;
         int ret;

         MPI_Init(p->argc1, p->argv1);
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);

         printf("my rank is %d\n", rank);
         if (rank == 0)
         {
             /* Sender: ship the string including its terminating '\0'. */
             s = "abhik";
             len = strlen(s);
             MPI_Send(s, len + 1, MPI_CHAR, 1, 17, MPI_COMM_WORLD);
             printf("sent\n");
         }
         else
         {
             /* Receiver: wait for any message, then receive it.
                Note that s has not been assigned on this branch. */
             MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
             printf("the return stat is done\n");
             MPI_Recv(s, 10, MPI_CHAR, 0, 17, MPI_COMM_WORLD, &stat);

             printf("my name is %s\n", s);
         }
         MPI_Barrier(MPI_COMM_WORLD);
         printf("barrier passed\n");
         MPI_Finalize();
         return NULL;
     }

     int main(int argc, char *argv[])
     {
         pthread_t thr_id;
         struct procnum s;
         void *thrd_stat;
         int rc;

         s.argc1 = &argc;
         s.argv1 = &argv;

         /* pthread_create() returns an error number; it does not set errno. */
         rc = pthread_create(&thr_id, NULL, thrdprobe, (void *)&s);
         if (rc != 0)
         {
             fprintf(stderr, "pthread_create: %s\n", strerror(rc));
             return 1;
         }
         printf("created a thread\n");

         pthread_join(thr_id, &thrd_stat);
         printf("killing thread\n");
         return 0;
     }
     
     
     
     
     The following is the output, including the errors, as printed on the terminal.
     ---------------------------------------------------------
     
     created a thread
     created a thread
     my rank is 0
     sent
     my rank is 1
     the return stat is done
     Rank (1, MPI_COMM_WORLD): Call stack within LAM:
     Rank (1, MPI_COMM_WORLD): - MPI_Recv()
     Rank (1, MPI_COMM_WORLD): - main()
     MPI process rank 1 (n0, p1193) caught a SIGSEGV in MPI_Recv.
     -----------------------------------------------------------------------------
     
     One of the processes started by mpirun has exited with a nonzero exit
     code. This typically indicates that the process finished in error.
     If your process did not finish in error, be sure to include a "return
     0" or "exit(0)" in your C code before exiting the application.
     PID 1188 failed on node n0 with exit status 1.
     -----------------------------------------------------------------------------
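
One thing visible in the posted code, independent of threading: on rank 1 the pointer `s` is declared as `char *s` and is never pointed at any storage before `MPI_Recv(s, 10, ...)` is called, so the receive writes through an indeterminate address. That alone could plausibly produce the SIGSEGV inside MPI_Recv. The sketch below isolates the pattern so it can be run without an MPI runtime. It is only an illustration under stated assumptions: `stub_recv`, `recv_args`, `receiver_thread` and `receive_in_thread` are hypothetical names, and `stub_recv` is a stand-in for MPI_Recv, not a LAM function.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define RECV_MAX 10  /* same count the posted code passes to MPI_Recv */

/* Hypothetical stand-in for MPI_Recv (NOT part of LAM/MPI): copies a
 * message of at most `count` chars into storage supplied by the caller. */
static int stub_recv(char *buf, int count)
{
    const char *msg = "abhik";
    size_t n = strlen(msg) + 1;       /* include the '\0', as the sender does */

    if (buf == NULL || count < 0 || (size_t)count < n)
        return -1;
    memcpy(buf, msg, n);
    return 0;
}

struct recv_args {
    char buf[RECV_MAX];  /* real storage for the incoming message */
    int  status;         /* result of the receive, read after join */
};

/* Child thread: receives into the buffer owned by its argument. */
static void *receiver_thread(void *arg)
{
    struct recv_args *a = arg;

    a->status = stub_recv(a->buf, RECV_MAX);
    return NULL;
}

/* Runs the receive in a child thread and copies the message into `out`.
 * Returns 0 on success, -1 on any failure. */
int receive_in_thread(char *out, size_t out_size)
{
    struct recv_args a;
    pthread_t tid;

    memset(&a, 0, sizeof a);
    if (pthread_create(&tid, NULL, receiver_thread, &a) != 0)
        return -1;
    pthread_join(tid, NULL);

    if (a.status != 0 || out_size < sizeof a.buf)
        return -1;
    memcpy(out, a.buf, sizeof a.buf);
    return 0;
}
```

In the posted program, the analogous change would be to declare something like `char buf[10];` in thrdprobe and pass `buf` to both MPI_Recv and the following printf on the rank 1 branch. Whether LAM 6.5.6 places further restrictions on which thread may call MPI is a separate question that this sketch does not address.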
     
     We urgently need help. Please reply with respect to LAM 6.5.6 only.

     With regards