I tried searching the archives for this, but the search function on the website seems to be broken.
When I try to use LAM/MPI with Torque, I get the following error message: "The boot SSI tm module found that your local host is not in the hostfile "PBS_NODEFILE"." This happens both when I run lamboot on its own and when I specify the nodes file directly with lamboot $PBS_NODEFILE.
However, when I cat the nodes file, I can plainly see that node n27 is in the file, and running hostname confirms that I am on host n27. When I run lamboot with -d, I get the following:
n-1<18466> ssi:boot:tm: found the following 1 hosts:
n-1<18466> ssi:boot:tm: n0 n27 (cpu=1)
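The manual check described above (comparing the output of hostname against the contents of the nodes file) can be sketched as a small shell snippet. This is only an illustration: the function name host_in_nodefile is hypothetical, and the exact whole-line comparison is an assumption about how one might check by hand, not a description of what the tm module does internally. It tries both the full and the short host name, since a mismatch between the two is one possible thing to rule out.

```shell
# Hypothetical helper: succeed if a host name appears as a whole line
# in a nodes file. Exact match only; this is an assumption for manual
# checking, not necessarily the comparison the tm module performs.
host_in_nodefile() {
    nodefile=$1
    host=$2
    # -x: match the whole line, -F: treat the name as a fixed string
    grep -qxF -- "$host" "$nodefile"
}

# Inside a Torque job, compare both the full and the short host name
# against the nodes file that Torque provides.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    for name in "$(hostname)" "$(hostname -s)"; do
        if host_in_nodefile "$PBS_NODEFILE" "$name"; then
            echo "$name: listed in $PBS_NODEFILE"
        else
            echo "$name: NOT listed in $PBS_NODEFILE"
        fi
    done
fi
```

If one form of the name is listed and the other is not, that difference may be worth mentioning when reporting the problem.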
What is the tm module looking for that isn't being provided correctly? I am running LAM 7.1.4 with Torque 2.3.4.