This is not a bug, but rather a limitation, and the
good news is that it is almost "fixed".
The problem is that only the first process is started by
pbs_mom; the processes on the other hosts are started by
the rsh daemon, which knows nothing about your PBS
policies.
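To illustrate the mechanism (a minimal sketch, not PBS code): on Unix a
process inherits its nice level from whatever started it. So a process
started by pbs_mom gets pbs_mom's setting, while one started by the rsh
daemon gets the rsh daemon's. The names REPORT, nice_of_child and demo
below are made up for this example, and the "launcher" here just stands
in for whichever daemon starts the MPI process.

```python
import os
import subprocess
import sys

# Command that prints the nice value of whichever process runs it.
REPORT = [sys.executable, "-c", "import os; print(os.nice(0))"]

def nice_of_child():
    """Spawn a child and return the nice value it reports about itself."""
    out = subprocess.run(REPORT, capture_output=True, text=True, check=True)
    return int(out.stdout.strip())

def demo(increment=5):
    """Change the launcher's nice value, then see what a child inherits."""
    before = os.nice(0)          # current nice value of this launcher
    after = os.nice(increment)   # raise it (lowering priority needs no root)
    return before, after, nice_of_child()

if __name__ == "__main__":
    before, after, child = demo()
    print("launcher before:", before)
    print("launcher after: ", after)
    print("child inherited:", child)
```

The child always reports the launcher's current value, which is why only
the processes that pbs_mom itself starts end up with the nice level you
configured in PBS.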
A nice guy, Pete Wyckoff, has written MpiExec, which
allows all processes, not only the first one, to be
executed by pbs_mom on each of the nodes.
http://www.osc.edu/~pw/mpiexec/
Another benefit is that parallel jobs can now be
killed, and job accounting now works for all
processes.
Work is "under way" to support LAM, but I don't know
the details.
Pete, can you tell us what works under LAM and what
doesn't? Thanks.
-Ron
--- Michael Sabielny wrote:
> ... only the node with the "Mother Superior",
> the pbs_mom on the first allocated node, has its
> desired nice level of -20.
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/