I'm trying to run the following code to spawn a simple hello program, but
there is a problem with the checkpoint/restart (CR) modules.
//hello.c
#include <stdio.h>   /* needed for printf */
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    printf("Hello World\n");
    MPI_Finalize();
    return 0;
}
//spawnex.c
#include <stdio.h>
#include <mpi.h>

#define NUM_CHILDREN 3

int main(int argc, char *argv[])
{
    int myrank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    printf("\nMyrank is %d\n", myrank);
    if (myrank == 0)
    {
        MPI_Comm childcommunicator;
        MPI_Info infoobject;
        /* MPI_Comm_spawn writes one error code per requested child */
        int errcodes[NUM_CHILDREN];

        MPI_Info_create(&infoobject);
        printf("Trying to spawn");
        MPI_Comm_spawn("hello", MPI_ARGV_NULL, NUM_CHILDREN, infoobject, 0,
                       MPI_COMM_SELF, &childcommunicator, errcodes);
        printf("\nSpawn successful\n");
        MPI_Info_free(&infoobject);
    }
    MPI_Finalize();
    return 0;
}
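As a side note, the spawn call above reports per-child results through its last argument, so "Spawn successful" should really be printed only after checking them. Below is a minimal sketch of that check, written without MPI so it can be read on its own. The names `count_failed_spawns` and `SPAWN_OK` are hypothetical, and `SPAWN_OK` is only a stand-in for `MPI_SUCCESS`, which I am assuming is 0. This only matters when the spawn call actually returns; with the default fatal error handler it may abort first.

```c
#include <stdio.h>

/* Hypothetical stand-in for MPI_SUCCESS (assumed to be 0).
   With <mpi.h> available, compare against MPI_SUCCESS instead. */
#define SPAWN_OK 0

/* Count how many entries of the per-child error-code array report
   a failure, printing one line to stderr for each failed child. */
int count_failed_spawns(const int *errcodes, int n)
{
    int i;
    int failed = 0;

    for (i = 0; i < n; i++) {
        if (errcodes[i] != SPAWN_OK) {
            fprintf(stderr, "child %d failed to spawn (error code %d)\n",
                    i, errcodes[i]);
            failed++;
        }
    }
    return failed;
}
```

In spawnex.c this would be called right after the spawn returns, on an array with one entry per requested child, printing the success message only when the result is 0.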
The error I get when trying to run it is:
[mpigroup_at_xxx xxxxx]$ mpirun N spawn
Myrank is 0
Myrank is 1
Myrank is 2
-----------------------------------------------------------------------------
It seems that [at least] one of the child processes that was started
by MPI_Comm_spawn* chose a different CR module than the parent
application. For example, one (of the) child process(es) that
differed from the parent is shown below:
Parent application: blcr (v1.1.0)
Child MPI_COMM_WORLD rank 0: none (v-1.-1.-1)
All MPI processes must choose the same CR module and version when
they start. Check your SSI settings and/or the local environment
variables on each node.
-----------------------------------------------------------------------------
Trying to spawnMPI_Comm_spawn: unclassified (rank 0, MPI_COMM_SELF)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Comm_spawn()
Rank (0, MPI_COMM_WORLD): - main()
MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
MPI_Recv: process in local group is dead (rank 2, MPI_COMM_WORLD)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Recv()
Rank (1, MPI_COMM_WORLD): - MPI_Barrier()
Rank (1, MPI_COMM_WORLD): - MPI_Finalize()
Rank (1, MPI_COMM_WORLD): - main()
Rank (2, MPI_COMM_WORLD): - MPI_Recv()
Rank (2, MPI_COMM_WORLD): - MPI_Barrier()
Rank (2, MPI_COMM_WORLD): - MPI_Finalize()
Rank (2, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
PID 29554 failed on node n2 (172.30.0.143) with exit status 1.
-----------------------------------------------------------------------------
Thanks for any help.