Try looking under /tmp/lam-username_at_hostname on each host. After lamboot
has run, there appears to be a file in this directory called "lam" which seems
to contain the PIDs of the LAM processes running on that machine. The first
appears to be the PID of the LAM daemon itself on that machine.
I don't have any documentation that states this; it is just my observation.
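
If that observation holds, a scheduler could collect the PIDs on each node
with something like the Python sketch below. It assumes the "lam" file is
plain text holding whitespace-separated PIDs, which I have not verified, and
the function names (parse_pids, read_lam_pids) are made up for illustration.

```python
from pathlib import Path


def parse_pids(text):
    """Return every whitespace-separated token that is a decimal integer.

    ASSUMPTION: the file holds PIDs as plain decimal text; anything else
    (labels, blank lines) is skipped.
    """
    return [int(tok) for tok in text.split() if tok.isdecimal()]


def read_lam_pids(session_dir):
    """Read the "lam" file inside session_dir and return the PIDs found.

    session_dir is the per-user directory under /tmp described above.
    Returns an empty list if the file is missing or unreadable.
    """
    pid_file = Path(session_dir) / "lam"
    try:
        return parse_pids(pid_file.read_text(errors="replace"))
    except OSError:
        return []


# Usage (run on each node, passing that node's directory):
#   pids = read_lam_pids(path_to_lam_directory)
#   pids[0] would be the daemon, if the first-entry observation is right.
```

This has to run on every node, for example through rsh or ssh, because the
directory is local to each host.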
On Thursday 08 January 2004 23:30, Marty wrote:
> Dear all,
>
> I have a problem,
>
> I am developing a job scheduling system for a PC cluster.
> So I want to control parallel (MPI) jobs, and I must get the PIDs of all MPI
> processes.
> Unfortunately, I can't get the PIDs of the MPI processes on remote nodes.
>
> For example,
> I execute "mpirun -np 3 ./cpi" at server node and then the MPI program will
> run in node 1, 2 and 3.
>
> How do I get the PID in node1, 2 and 3 ?
>
> I tried using the "ps" command to get the PIDs by matching user ID and
> program name, but there is still a problem: if a user runs several jobs with
> the same program name,
> then I can't identify the program.
>
> I traced the source code of OpenPBS, but I can't find the solution.
> ( OpenPBS can handle this well. )
>
> How do I get the PID in node1, 2 and 3 ?
>
> thank you~