On Wed, Sep 17, 2003 at 04:34:04PM +0200, Jean-Marie Teuler wrote:
>Is it possible to cheat the tm module so that it enlists instead
>node1-1000... node4-1000?
I second this request.
We have weird network topologies on some of our clusters (as Jeff knows :-)
but a common feature is a fast and a slow set of names - slow for daily use,
fast just for MPI.
I believe LAM uses standard 'gethostbyname' functions to find the
primary interface. Perhaps it's possible to LD_PRELOAD a tiny shared
library before booting LAM which replaces 'gethostbyname' (or similar)
with a version that returns the fast network name?
Pretty ugly, but might work.
Some mechanism in LAM itself would be (of course) better.
Without tm it's pretty easy - we just translate the node list that PBS
gives us to the fast names (using a Python script) and lamboot with
that. But yeah, tm is a problem.
cheers,
robin