LAM/MPI General User's Mailing List Archives

From: Sriram Sankaran (ssankara_at_[hidden])
Date: 2003-07-11 12:24:00


Hi,

I'm not sure what you mean by saying that the program is ok only until the
17th step. If one or more processes died during execution, check for
common sources of error like mismatched sends/receives in your code. If
you are seeing memory-related problems, you might want to try running your
application through a memory-checking debugger (purify, Solaris' bcheck,
Linux's valgrind, etc.). Check the LAM FAQ under "Debugging MPI Programs
Under LAM/MPI" for how to run MPI programs through debuggers -- we find
such tools to be invaluable both in developing LAM and writing MPI
applications.

As for limits on MPI_Send, the only limits are those imposed by the OS.
It is unlikely that this could be causing your problem.

Hope this helps.

-- 
Sriram Sankaran
email: ssankara_at_[hidden]
http://www.lam-mpi.org/
Thus spake Gilda Lucia Bakker Batista de Menezes, on Jul 10:
>Date: Thu, 10 Jul 2003 18:36:24 -0300
>From: Gilda Lucia Bakker Batista de Menezes <gilda_at_[hidden]>
>Reply-To: General LAM/MPI mailing list <lam_at_[hidden]>
>To: lam_at_[hidden]
>Cc: osl-mailman_at_[hidden]
>Subject: LAM: SEND limit
>
>To MPI-Forum
>
>Please, help me.
>
>I have a problem with a program that I am developing.
>I need to send the coordinates of some points to some processes, and the number
>of points must increase in each step.
>But the program is OK only until the 17th step.
>Is there any limit to the MPI SEND routine? I use: a Beowulf, Linux, C
>language, MPI-LAM.
>What can I do to solve my problem?
>
>Thanks,
>G.Menezes.
>--------------------------------------------------
>This is the program:
>
>/* vortex.c */
>
>#include "mpi.h"
>#include <stdio.h>
>#include <math.h>
>#include <stdlib.h>
>
>#define MASTER 0
>#define FROM_MASTER 1
>#define FROM_WORKER 2
>#define MR 3 //number of points
>
>int main(int argc,char *argv[])
>{
>int Rrows,Roffset,f,numtasks, taskid, numworkers, source, dest, mtype, rows,
>averow, extra, offset, i, j, k, rc,h,jj;
>int np=MR, nv=MR, alto,  alfa;
>MPI_Status status;
>
>double dx, dy, eps=0.045;
>double ze[MR][2], ze1[MR][2], ze2[MR][2], zc[MR][2], zg[MR][2], n[MR][2], dels[3];
>
>double *zv,*zvtemp,*zz;
>
>rc=MPI_Init(&argc,&argv);
>rc=MPI_Comm_size(MPI_COMM_WORLD,&numtasks);
>rc=MPI_Comm_rank(MPI_COMM_WORLD,&taskid);
>
>if (rc != 0)
>  printf ("Error!!\n");
>numworkers = numtasks-1;
>
>//beginning
>alto = 15;
>for(alfa=0; alfa<alto; alfa++)
>{
>
>  printf(".................... \n ->rank:%d alfa:%d  nv:%d\n ",taskid,alfa,nv);
>  if (taskid == 0) // MASTER)
>  {
>
>    if(alfa==0){
>      zv=(double *) calloc(sizeof(double),nv*2);
>      zvtemp=(double *) calloc(sizeof(double),nv*2);
>    }else{
>      zv=(double *) calloc(sizeof(double),nv*2);
>      for(f=0;f<(alfa*MR*2);f++){
>        zv[f]=zvtemp[f];
>      }
>      zvtemp=(double *) calloc(sizeof(double),nv*2);
>    }
>
>    if(alfa==0)
>    {
>     //coordinates
>
>       ze[0][0]=0.0001;
>       ze[1][0]=0.0002;
>       ze[2][0]=0.0003;
>       ze[0][1]=0.0004;
>       ze[1][1]=0.0005;
>       ze[2][1]=0.0006;
>
>
>       for(j=0;j<np;j++)
>       {
>         //end points
>         ze1[j][0]=ze[j][0];
>         ze2[j][0]=ze[j+1][0];
>         ze1[j][1]=ze[j][1];
>         ze2[j][1]=ze[j+1][1];
>         //normal vector
>         dx=ze2[j][0]-ze1[j][0];
>         dy=ze2[j][1]-ze1[j][1];
>         dels[j]=sqrt(dx*dx+dy*dy);
>         n[j][0]=-dy/dels[j];
>         n[j][1]=dx/dels[j];
>         //control points
>         zc[j][0]=0.5*(ze1[j][0]+ze2[j][0]);
>         zc[j][1]=0.5*(ze1[j][1]+ze2[j][1]);
>         //generation points
>         zg[j][0]=zc[j][0]+eps*n[j][0];
>         zg[j][1]=zc[j][1]+eps*n[j][1];
>       }
>
>       //container of points
>       jj=0;
>       for(j=0;j<nv;j++)
>       {
>
>         zv[jj]=ze[j][0];
>         zv[jj+1]=ze[j][1];
>         zvtemp[jj]=ze[j][0];
>         zvtemp[jj+1]=ze[j][1];
>
>         jj+=2;
>       }
>    }else{
>      //new points coordinates
>      jj=0;
>      h=0;
>      for(j=(alfa*MR*2); j<((alfa*MR*2)+(MR*2)); j++)
>      {
>
>        zv[(alfa*MR*2)+jj]=ze[h][0];
>        zv[(alfa*MR*2)+jj+1]=ze[h][1];
>
>        jj+=2;
>        h++;
>
>      }
>
>    }
>    // test
>     for(f=0;f<nv*2;f++){
>       if ((f%2)==0)
>         printf("----------\n");
>       printf("alfa=%d   %d:%f ",alfa,f,zv[f]);
>     }
>
>
>    /* new lines */
>    averow = nv/numworkers;
>    extra = nv%numworkers;
>    offset = 0;
>    mtype = FROM_MASTER;
>
>    for (dest=1; dest<=numworkers; dest++)
>/* send message to workers */
>    {
>      rows = (dest <= extra) ? averow+1 : averow;
>      MPI_Send(&offset, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
>      MPI_Send(&rows, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
>      MPI_Send(&zv[offset*2], rows*2, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);
>
>      offset = offset + (rows);
>    }
>    mtype = FROM_WORKER;
>    for (i=1; i<=numworkers; i++)
>/* receive the results */
>    {
>      source = i;
>
>      MPI_Recv(&offset, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
>      MPI_Recv(&rows, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
>
>      MPI_Recv(&zvtemp[offset*2], rows*2, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);
>
>    }
>    //new points
>
>  }
>  if(taskid > MASTER) /* workers tasks */
>  {
>    zv=(double *) calloc(sizeof(double),2*nv);
>    mtype = FROM_MASTER;
>    Roffset=0;
>    Rrows=0;
>    /* receive message from master */
>    MPI_Recv(&Roffset, 1, MPI_INT, 0,mtype, MPI_COMM_WORLD, &status);
>    MPI_Recv(&Rrows, 1, MPI_INT,0,mtype, MPI_COMM_WORLD, &status);
>
>    MPI_Recv(&zv[0], Rrows*2, MPI_DOUBLE,0,mtype,MPI_COMM_WORLD, &status);
>    //displacement of points
>    for(i=0;i<Rrows*2;i++)
>    {
>        zv[i]=zv[i]*10;  // my test
>
>    }
>
>    mtype = FROM_WORKER;
>
>    MPI_Send(&Roffset, 1, MPI_INT, 0, mtype, MPI_COMM_WORLD);
>    MPI_Send(&Rrows, 1, MPI_INT, 0, mtype, MPI_COMM_WORLD);
>    MPI_Send(&zv[0], Rrows*2, MPI_DOUBLE,0, mtype, MPI_COMM_WORLD);
>    free(zv);
>  }
>  nv=nv+MR;
>
>  //step end
>}
>
>MPI_Finalize(); /* end */
>if(taskid==0){
> free(zv);
> free(zvtemp);
>}
>
>}
>
>