
LAM/MPI General User's Mailing List Archives


From: Marty (cs87668_at_[hidden])
Date: 2004-01-09 00:30:52


Dear all,

I have a problem.

I am developing a job scheduling system for a PC cluster.
To control parallel (MPI) jobs, I need the PIDs of all the MPI processes
in a job.
Unfortunately, I can't get the PIDs of the MPI processes running on remote nodes.

For example,
I execute "mpirun -np 3 ./cpi" on the server node, and the MPI program then
runs on nodes 1, 2 and 3.

How do I get the PIDs on nodes 1, 2 and 3?

I tried using the "ps" command to find the PIDs by matching the user ID and
program name, but there is still a problem: if a user runs several jobs with
the same program name,
I can't tell which job each process belongs to.
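
To illustrate the ambiguity, here is a minimal sketch in Python of the matching I tried. It assumes the process list comes from something like "ps -eo pid,user,comm"; the function name match_processes and the sample listing (PIDs, user names) are made up for illustration and are not real output from my cluster.

```python
def match_processes(ps_output, user, program):
    """Return the PIDs of rows whose user and command name both match."""
    matches = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)  # pid, user, command
        if len(fields) != 3:
            continue
        pid, owner, comm = fields
        if owner == user and comm.strip() == program:
            matches.append(int(pid))
    return matches


# Hypothetical listing on one node while the same user runs two "cpi" jobs.
SAMPLE = """\
  PID USER     COMMAND
 2101 marty    cpi
 2102 marty    cpi
 2150 alice    cpi
 2203 marty    lamd
"""

print(match_processes(SAMPLE, "marty", "cpi"))
```

Both of the user's "cpi" processes match, and nothing in the user ID or program name says which mpirun invocation started which one.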

I traced the source code of OpenPBS, but I couldn't find the solution.
(OpenPBS can handle this well.)

So, how do I get the PIDs on nodes 1, 2 and 3?

Thank you~