It looks like you're trying to use MPI as a parallel launcher, which
isn't quite what MPI is for.
Indeed, you're calling execl(), which replaces the MPI process with
your new /bin/bash process. execl() does not return on success, so
MPI_FINALIZE is never executed in those ranks. LAM treats that as an
error.
Note, too, that system() is technically not supported. It'll work
fine on TCP and shared memory, but will not work properly on Myrinet
or IB networks.
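If you do want to stay inside an MPI program, one possible workaround is to fork a child, exec the script in the child only, and wait for it in the parent, so that the parent still exists to finalize. The following is only a minimal sketch under that assumption: run_script_and_wait() and script_name_for_rank() are hypothetical helper names, not part of MPI or LAM, and since system() is itself built on fork(), the same network caveat presumably applies.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not an MPI or LAM function): run "bash <script>"
 * in a child process and wait for it to finish.
 * Returns the script's exit status, or -1 if the child could not be
 * started or did not exit normally. */
int run_script_and_wait(const char *script)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: only this child is replaced by bash; the parent lives on. */
        execl("/bin/bash", "bash", script, (char *)NULL);
        perror("execl");   /* reached only if execl() failed */
        _exit(127);
    }

    /* Parent: block until the child is done. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical helper: build the per-rank script name ("1-exec",
 * "2-exec", ...) instead of writing one if-statement per rank.
 * Returns 0 on success, -1 if the buffer is too small. */
int script_name_for_rank(int rank, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%d-exec", rank);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

In the posted program, each nonzero rank would call script_name_for_rank(myrank, ...) and then run_script_and_wait() on the result, between MPI_Init and MPI_Finalize. Because the parent process survives, it still reaches MPI_Finalize.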
You might simply want to use a resource manager to launch your serial
applications in parallel. That might be a bit easier than using MPI.
On Feb 11, 2008, at 10:32 PM, fahad saeed wrote:
> Hello All,
>
> I am an MPI newbie and am having a problem. I intend to run a binary
> on different processors with different data as inputs, i.e.
> different files are processed using the same binary.
> My binary takes these arguments:
> binary -in file1 -out file2
>
> file1 and file2 change for each node.
>
> I wrote a small program (small is all I can write right now :( ).
> The code seems to work as intended while it runs, but it exits
> partway through with an error.
>
> In the program, the command for each node is given in a small
> bash file: 1-exec, 2-exec, etc.
>
> error:
> ##########################
> rank 3 in job 83 sapphire.bw01.uic.edu_56332 caused collective abort of all ranks
> exit status of rank 3: return code 0
> ######################################################
>
> The code:
>
> #include "mpi.h"
> #include "stdio.h"
> #include <unistd.h>
> int main( int argc, char *argv[] )
> {
> int numprocs, myrank,work,namelen;
> char *file;
> char processor_name[MPI_MAX_PROCESSOR_NAME];
>
>
>
> MPI_Init(&argc, &argv );
> MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
> MPI_Comm_rank(MPI_COMM_WORLD,&myrank);
>
>
> if (myrank) printf("My process rank ==> %d\n",myrank);
>
> if (myrank==1)execl("/bin/bash","bash","1-exec",0);
> if (myrank==2)execl("/bin/bash","bash","2-exec",0);
> if (myrank==3)execl("/bin/bash","bash","3-exec",0);
> if (myrank==4)execl("/bin/bash","bash","4-exec",0);
> if (myrank==5)execl("/bin/bash","bash","5-exec",0);
> if (myrank==6)execl("/bin/bash","bash","6-exec",0);
> if (myrank==7)execl("/bin/bash","bash","7-exec",0);
> if (myrank==8)execl("/bin/bash","bash","8-exec",0);
> if (myrank==9)execl("/bin/bash","bash","9-exec",0);
> if (myrank==10)execl("/bin/bash","bash","10-exec",0);
> if (myrank==11)execl("/bin/bash","bash","11-exec",0);
> if (myrank==12)execl("/bin/bash","bash","12-exec",0);
> if (myrank==13)execl("/bin/bash","bash","13-exec",0);
> if (myrank==14)execl("/bin/bash","bash","14-exec",0);
> if (myrank==15)execl("/bin/bash","bash","15-exec",0);
> if (myrank==16)execl("/bin/bash","bash","16-exec",0);
>
> MPI_Finalize();
> }
>
>
> ****************************
>
> Any suggestions? I tried to find the possible cause of the error,
> but it is discussed very little. All I could learn was that it might
> be due to too few processing units. However, I cannot run the above
> program even with fewer processes: I have at least 6 processing
> units, and running the program for even 1 still exits with the error.
>
> Thanks a lot,
>
> Fahad Saeed
>
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
--
Jeff Squyres
Cisco Systems