I know this list is for LAM, but I hope someone may be able to answer my question about MPICH.
I am trying to create a point-to-point communication profile for each node.
In my profiling library, MPI_Init looks like this:
int MPI_Init( int *argc, char ***argv )
{
    int i;
    .......
    printf("\n");
    for (i = 0; i < *argc; i++) {
        printf("%s#", (*argv)[i]);
    }
    startTick = times(NULL);
    printf("\n breakpoint1 \n");
    returnVal = PMPI_Init( argc, argv );
    printf("\n breakpoint2 \n");
    stopTick = times(NULL);
    ..........
}
I want to name the profiling file with the application name and the rank.
Rocks 4.0 is installed on my cluster, and the frontend is different from the compute nodes.
The compute nodes are all the same dual-CPU machines.
I find something strange in the output of argv[]:
[qiang_at_grid11 mympiProftest]$ mpirun -np 4 -nolocal -machinefile m2 ./g
/home/qiang/mympiProftest/./g#-p4pg#/home/qiang/mympiProftest/PI26382#-p4wd#/home/qiang/mympiProftest#
/home/qiang/mympiProftest/./g#c0-1#33375#-p4amslave#-p4yourname#c0-2#-p4rmrank#1#
/home/qiang/mympiProftest/./g#c0-1#33375#-p4amslave#-p4yourname#c0-1#-p4rmrank#2#
/home/qiang/mympiProftest/./g#c0-1#33375#-p4amslave#-p4yourname#c0-2#-p4rmrank#3#
Greeting from process 1!
Greeting from process 2!
Greeting from process 3!
---------------------------------------------------------
I issue mpirun at the frontend with the -nolocal option, and the machinefile m2 contains c0-1 and c0-2.
Can anyone explain the output, so that I can use it to decide the profile file name?
/home/qiang/mympiProftest/./g#-p4pg#/home/qiang/mympiProftest/PI26382#-p4wd#/home/qiang/mympiProftest#
(application name)            (hostname?)  (process id?)              (?)
and
/home/qiang/mympiProftest/./g#c0-1#33375#-p4amslave#-p4yourname#c0-2#-p4rmrank#1#
Why is the hostname in the output different? I specified c0-1 and c0-2 as the compute machines.
Thanks.
Qiang