Try outputting to a file instead of to stdout (you might want to ensure
that you write to *separate* files, as having multiple processes --
potentially on different nodes -- writing to the same file will likely
have non-deterministic results). I think the results will be
enlightening. :-)
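For example, here is a minimal sketch of what the spawned slave could
do instead of printf'ing to stdout. The filename scheme (rank within
the spawned MPI_COMM_WORLD plus the Unix PID) is just an illustration,
not something in your current code:

    #include <stdio.h>
    #include <unistd.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank;
        char filename[64];
        FILE *out;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Build a filename that no other process will use, e.g.
           "esclave.0.12345.out".  Each parent spawns its children into a
           separate MPI_COMM_WORLD, so the PID is what keeps children of
           different parents from colliding. */
        snprintf(filename, sizeof(filename), "esclave.%d.%d.out",
                 rank, (int) getpid());

        out = fopen(filename, "w");
        if (out != NULL) {
            fprintf(out, "%s\n", (argc > 1) ? argv[1] : "(no argument)");
            fclose(out);  /* flushed to disk before the process exits */
        }

        MPI_Finalize();
        return 0;
    }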
In short: standard output / standard error cannot be used for reliable
output in a parallel environment, particularly when you are dealing
with race conditions involving new processes arriving and old processes
dying. What I think is happening is that you're seeing a race
condition where mpirun / the parent process is dying before the
children's stdout/stderr have a chance to reach your output. This
doesn't mean that they didn't run; it only means that you didn't see
their output.
(for example, see this thread:
http://www.lam-mpi.org/MailArchives/lam/msg09368.php)
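If you want the parent to at least wait for its children before mpirun
exits, one option is to merge the intercommunicator returned by
MPI_Comm_spawn and barrier on it before finalizing. This is only a
sketch of the idea, not a drop-in patch for your program -- "spawn_demo"
is a placeholder executable name, and this still does not guarantee
that a given MPI implementation forwards a spawned child's stdout:

    /* One program that acts as both parent and child: the parent spawns a
       copy of itself, then both sides merge the intercommunicator and meet
       at a barrier, so the parent cannot reach MPI_Finalize (and mpirun
       cannot exit) before the child has finished printing. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        MPI_Comm parent, inter, merged;

        MPI_Init(&argc, &argv);
        MPI_Comm_get_parent(&parent);

        if (parent == MPI_COMM_NULL) {
            /* Parent: spawn one child running this same executable */
            MPI_Comm_spawn("spawn_demo", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                           0, MPI_COMM_SELF, &inter, MPI_ERRCODES_IGNORE);
        } else {
            /* Child: do the work / printing, then meet the parent below */
            printf("hello from the spawned child\n");
            fflush(stdout);
            inter = parent;
        }

        /* Merge parent and child into one intracommunicator and barrier
           on it; only after that does either side call MPI_Finalize. */
        MPI_Intercomm_merge(inter, (parent == MPI_COMM_NULL) ? 0 : 1,
                            &merged);
        MPI_Barrier(merged);
        MPI_Comm_free(&merged);

        MPI_Finalize();
        return 0;
    }

With something along those lines, each of your 8 original processes
waits for its own child before finalizing, so the child at least lives
long enough to produce its output -- though writing to separate files,
as above, is still the more reliable option.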
On Dec 9, 2004, at 4:49 AM, Gabriel Antoine Louis Paillard wrote:
> Good Morning,
>
> Thanks again for your answer, but unfortunately I still have some
> doubts.
>
>
>>
>> I'm not quite sure that I understand... I'm not sure where you got
>> the 8+7 numbers from. The last version of your program that you
>> sent, assuming that you "mpirun -np 8 your_program", will run
>> MPI_COMM_SPAWN exactly 8 times -- once in each process that was
>> launched by mpirun. Since you used MPI_COMM_SELF as the communicator
>> to MPI_COMM_SPAWN and specified the root as 0, each of the 8
>> processes will be the root, and therefore each will launch
>> (numprocs) new processes with the (argv) that was constructed for
>> that spawning process.
>
> OK, that is exactly what I did -- mpirun -np 8 Program, with:
>
> MPI_Comm_spawn(command, argv, 1, MPI_INFO_NULL, mon_rang,
>                MPI_COMM_SELF, &nouveau_monde, &errcode);
>
> But the output is given by just one child process, not by the eight
> created processes. That was the reason I tried maxprocs=8, and when I
> did that I obtained the answers of the 8 new processes plus 7 answers
> from the process with rank 0 in the original MPI_COMM_WORLD (7 equal
> answers).
>
> Below I have put the slave program and the main program:
>
> ("Esclave 1 called by the main program")
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <mpi.h>
>
> int main(int argc,char *argv[])
> {
> unsigned long int nombre;
> MPI_Comm parent;
>
>
> MPI_Init(&argc, &argv);
>
> MPI_Comm_get_parent(&parent);
>
> printf("%s\n",argv[1]);
>
> exit(0);
> }
>
> Main Program:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <mpi.h>
>
>
> int slave ();
>
> int main(int argc,char *argv[])
> {
> int myrank;
> double starttime,endtime;
>
> MPI_Init(&argc, &argv);
> starttime = MPI_Wtime();
> MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>
> switch(myrank) {
> case 0:
> slave(1);
> break;
> case 1:
> slave(7);
> break;
> case 2:
> slave(11);
> break;
> case 3:
> slave(13);
> break;
> case 4:
> slave(17);
> break;
> case 5:
> slave(19);
> break;
> case 6:
> slave(23);
> break;
> case 7:
> slave(29);
> break;
> }
> endtime = MPI_Wtime();
> printf("Mesure du temps: %1.15f\n", endtime-starttime);
>
> MPI_Finalize();
> exit(0);
> }
>
> int slave (unsigned long int argument)
> {
> char command[] = "Esclave1";
> char **argv;
> int errcode;
> int mon_rang,ntasks,err;
> unsigned long int nouveau_processus;
> MPI_Comm nouveau_monde;
>
> mon_rang=MPI_Comm_rank(MPI_COMM_WORLD, &mon_rang);
>
> argv=(char **)malloc(2*sizeof(char *));
> nouveau_processus = argument+30;
> argv[0]=(char*)malloc(10*sizeof(char));
> sprintf(argv[0],"%ld",nouveau_processus);
> argv[1] = NULL;
>
> MPI_Comm_set_errhandler(MPI_COMM_WORLD,MPI_ERRORS_RETURN);
>
> err = MPI_Comm_spawn(command, argv, 1, MPI_INFO_NULL, mon_rang,
>                      MPI_COMM_SELF, &nouveau_monde, &errcode);
>
> return(0);
> }
>
>
> for maxprocs=1, the output is:
>
> mpirun -np 8 DistributedWheelSieve7
> 31
>
>
> and for maxprocs=8, we have:
>
> mpirun -np 8 DistributedWheelSieve7
> 47
> 37
> 53
> 41
> 43
> 49
> 31
> 31
> 31
> 31
> 31
> 31
> 31
> 31
> 59
>
>
> Thank you again,
>
> Gabriel
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
--
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/