Hi,
I'm trying to spawn multiple processes on different nodes, using MPI_Comm_spawn_multiple and the "host" key in MPI_Info.
For test purposes, I am running the application on a cluster of only two machines ('node0' and 'node1').
The LAM version is 7.1.2.
My lamhost file looks like the following:
node0
node1
I am running from node0, and recon works OK.
When I run "mpiexec -n 1 ./manager" (the process that spawns my 'workers'),
I can spawn the processes, but they always run on 'node0' only.
Is that a limitation of MPI_Comm_spawn_multiple?
Shouldn't it be possible to run the children on the remote node as well?
Cheers,
Laurent.
PS: Here is part of my 'manager' program:
NbExes = 2;
pArray_of_errcodes = calloc( NbExes, sizeof(int) );
pArray_of_maxprocs = calloc( NbExes, sizeof(int) );
pArray_of_commands = calloc( NbExes, sizeof(char*) );
pArray_of_info = calloc( NbExes, sizeof(MPI_Info) );
ppargv = calloc( NbExes, sizeof(char**) );
for( i=0; i<NbExes; i++ ) {
    // Name of the program
    sprintf( NomExe, "Test" );
    pArray_of_commands[i] = calloc( 255, sizeof(char) );
    sprintf( pArray_of_commands[i], "%s", NomExe );
    // Params: each argv array must be NULL-terminated, so allocate
    // one extra slot and set it to NULL explicitly
    sprintf( ParametresExe, "Param" );
    ppargv[i] = calloc( 2, sizeof(char*) );
    ppargv[i][0] = calloc( 256, sizeof(char) );
    strcpy( ppargv[i][0], ParametresExe );
    ppargv[i][1] = NULL;
    // MPI_Info: ask for one process on the host named NodeName[i]
    pArray_of_maxprocs[i] = 1;
    MPI_Info_create( &pArray_of_info[i] );
    MPI_Info_set( pArray_of_info[i], "soft", "0:1" );
    MPI_Info_set( pArray_of_info[i], "host", NodeName[i] );
    // Read the key back to check that it was stored
    MPI_Info_get( pArray_of_info[i], "host", 255, String, &flag );
    printf( "=============host = %s, flag = %d\n", String, flag );
}
// Launch the executables (root is rank 0 of MPI_COMM_WORLD)
MPI_Comm_spawn_multiple( NbExes, pArray_of_commands, ppargv,
                         pArray_of_maxprocs, pArray_of_info, 0,
                         MPI_COMM_WORLD, &intercomm, pArray_of_errcodes );
Using MPI_Info_get, I can see that the two hosts are set to node0 and node1.
Using a hostname command in 'Test', I can see that both programs run on node0.
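
A side note, separate from the placement question: as far as I understand the MPI standard, each argv array passed to MPI_Comm_spawn_multiple has to be NULL-terminated and must not include the command name. Below is a minimal sketch of a helper that builds such an array. It is plain C and does not call MPI at all; make_spawn_argv and free_spawn_argv are hypothetical names of my own, not part of LAM or MPI.

```c
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated argument vector built by make_spawn_argv(). */
void free_spawn_argv(char **argv)
{
    if (argv == NULL)
        return;
    for (char **p = argv; *p != NULL; p++)
        free(*p);
    free(argv);
}

/* Hypothetical helper: copy `count` strings into a freshly allocated,
 * NULL-terminated vector, the layout expected for one entry of the
 * array_of_argv argument. Returns NULL if an allocation fails. */
char **make_spawn_argv(const char *const *args, size_t count)
{
    char **argv = malloc((count + 1) * sizeof *argv);
    if (argv == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(args[i]) + 1;   /* include the '\0' */
        argv[i] = malloc(len);
        if (argv[i] == NULL) {
            /* argv[i] is NULL here, so it terminates the partial vector */
            free_spawn_argv(argv);
            return NULL;
        }
        memcpy(argv[i], args[i], len);
    }
    argv[count] = NULL;                     /* mandatory terminator */
    return argv;
}
```

In the manager, each ppargv[i] could then be filled with something like make_spawn_argv(params, 1), where params is an assumed array holding the single string "Param", and released with free_spawn_argv once the spawn call has returned.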