I am trying to boot a cluster through LAM, so I edited the lam-bhost.def
file. I can lamboot all the nodes in the cluster, but when I use a command
such as:
mpirun n0 n1-4 <filename>
-or-
lamexec n0 n1-4 <filename>
the program seems to run entirely on the first node, i.e., the same program
runs 5 times in a row on the main screen. Either that happens or, as happened
once, I receive an error message saying that the IP address 127.0.0.1
(localhost), which it names n0, would not accept. However, I have set my
private network IP addresses in the 192.168.xx.yy range. Why is it reading
localhost as n0 if I've lamboot-ed the cluster from my lam-bhost.def file,
where all the other nodes are named? And how do I fix it?
Thanks so much!
Deanna