On Thu, 12 Sep 2002, Cere M. Davis wrote:
> I'm running LAM 6.5.6 on a Debian Linux system running openMosix 2.4.19-3.
> I don't know if running openMosix is related to this, but I am not able
> to see remote lam processes on remote lam nodes. The processes don't seem
> to exist in the /proc filesystem so it makes sense why I cannot see the
> processes in ps or top but it doesn't make sense why I cannot even see the
> lam virtual machine container process on the remote nodes? If, for
> example, node 3 is a remote node and I run 'lamexec -n3 ps -ef', then I can see
> the processes correctly...which is what I would expect. But how is it
> that lam bypasses the kernel's proc filesystem as a running binary?
LAM/MPI does not do anything to bypass the accounting system in Linux. My
guess is that you are hitting an accounting limitation in the openMosix
setup.
I have no experience with openMosix, so I really can't be of much help.
All I can do is explain what LAM does during boot.
Lamboot starts an rsh/ssh shell on the remote node and launches hboot.
Hboot fork/execs "lamd" and then exits, causing the rsh/ssh to return. At
this point, lamboot collects some information from all the lamds and
broadcasts it back out. And the LAM environment is off and running.
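
To make the fork/exec step concrete, here is a minimal Python sketch of the general pattern. It is not LAM's actual code (hboot is not written in Python), and the function name fork_exec is made up for illustration:

```python
import os

def fork_exec(argv):
    """Fork; the child replaces itself with argv, the parent gets the child's PID."""
    pid = os.fork()
    if pid == 0:
        # Child: become the long-running program (lamd, in LAM's case).
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed; leave without running the parent's cleanup handlers.
            os._exit(127)
    # Parent: return immediately without waiting for the child.
    return pid
```

A launcher in the style of hboot would call something like fork_exec(["lamd"]) and then exit right away. The child keeps running after its parent is gone, and the parent's exit is what lets the rsh/ssh session return.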
My guess is that openMosix is losing track of the processes after the
rsh/ssh returns, which is why they don't show up where you might expect them. The
proc information on the node on which a LAM daemon is running should
always contain information about that lamd. If not, you should talk to
your system provider.
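
If you want to check the proc information directly on a node rather than going through ps or top, something like the following rough sketch would do it. It assumes the usual Linux layout where /proc/<pid>/cmdline holds the NUL-separated argument list; the function name pids_running and the proc_root parameter are made up for illustration:

```python
import os

def pids_running(name, proc_root="/proc"):
    """Return sorted PIDs whose argv[0] basename equals name, read from proc_root."""
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                argv = f.read().split(b"\0")
        except OSError:
            continue  # process exited or is unreadable
        if os.path.basename(argv[0]).decode(errors="replace") == name:
            matches.append(int(entry))
    return sorted(matches)
```

Running pids_running("lamd") on the node itself should return at least one PID if the daemon is visible in that node's /proc. An empty list on a node where lamexec works would point at the kernel-side accounting rather than at LAM.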
Brian
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have a LAM/MPI day: http://www.lam-mpi.org/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/