Greetings
I am running into a problem running lamboot with LAM 7.1. I have localhost in
my hostfile, and lamboot detects it, but then gives me an error message saying
that the local host is not in the hostfile. I found a report of a similar
problem, but it appears to have been fixed in 7.0.6. Any ideas?
yye00@Arbitrator work $ lamboot -d
n-1<6712> ssi:boot:open: opening
n-1<6712> ssi:boot:open: opening boot module globus
n-1<6712> ssi:boot:open: opened boot module globus
n-1<6712> ssi:boot:open: opening boot module rsh
n-1<6712> ssi:boot:open: opened boot module rsh
n-1<6712> ssi:boot:open: opening boot module slurm
n-1<6712> ssi:boot:open: opened boot module slurm
n-1<6712> ssi:boot:select: initializing boot module slurm
n-1<6712> ssi:boot:slurm: not running under SLURM
n-1<6712> ssi:boot:select: boot module not available: slurm
n-1<6712> ssi:boot:select: initializing boot module rsh
n-1<6712> ssi:boot:rsh: module initializing
n-1<6712> ssi:boot:rsh:agent: rsh
n-1<6712> ssi:boot:rsh:username: <same>
n-1<6712> ssi:boot:rsh:verbose: 1000
n-1<6712> ssi:boot:rsh:algorithm: linear
n-1<6712> ssi:boot:rsh:no_n: 0
n-1<6712> ssi:boot:rsh:no_profile: 0
n-1<6712> ssi:boot:rsh:fast: 0
n-1<6712> ssi:boot:rsh:ignore_stderr: 0
n-1<6712> ssi:boot:rsh:priority: 10
n-1<6712> ssi:boot:select: boot module available: rsh, priority: 10
n-1<6712> ssi:boot:select: initializing boot module globus
n-1<6712> ssi:boot:globus: globus-job-run not found, globus boot will not run
n-1<6712> ssi:boot:select: boot module not available: globus
n-1<6712> ssi:boot:select: finalizing boot module slurm
n-1<6712> ssi:boot:slurm: finalizing
n-1<6712> ssi:boot:select: closing boot module slurm
n-1<6712> ssi:boot:select: finalizing boot module globus
n-1<6712> ssi:boot:globus: finalizing
n-1<6712> ssi:boot:select: closing boot module globus
n-1<6712> ssi:boot:select: selected boot module rsh
LAM 7.1/MPI 2 C++/ROMIO - Indiana University
n-1<6712> ssi:boot:base: looking for boot schema in following directories:
n-1<6712> ssi:boot:base: <current directory>
n-1<6712> ssi:boot:base: $TROLLIUSHOME/etc
n-1<6712> ssi:boot:base: $LAMHOME/etc
n-1<6712> ssi:boot:base: /etc/lam-mpi
n-1<6712> ssi:boot:base: looking for boot schema file:
n-1<6712> ssi:boot:base: lam-bhost.def
n-1<6712> ssi:boot:base: found boot schema: /etc/lam-mpi/lam-bhost.def
n-1<6712> ssi:boot:rsh: found the following hosts:
n-1<6712> ssi:boot:rsh: n0 localhost (cpu=1)
-----------------------------------------------------------------------------
The boot SSI rsh module found that your local host is not in the
hostfile "/etc/lam-mpi/lam-bhost.def".
The local host name *must* be in the list of hosts in the hostfile.
In other words, you must boot LAM from a node that will be part of the
universe.
- If you simply forgot to put the local host in the boot
schema file, add it and re-run The boot SSI rsh module
- If you are trying to boot LAM from a node that will not be
part of the universe, you must login to on of the nodes that
will be part of the universe (i.e., one of the nodes in the
hostfiles), and re-run The boot SSI rsh module
Although the local host name is usually the first in the list to avoid
I/O ambiguities, it can actually appear anywhere in the list.
-----------------------------------------------------------------------------
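For anyone trying to reproduce this, below is a rough sketch of the kind of check the message above describes. It is not LAM's actual code. It assumes the boot schema lists one host per line, with optional key=value settings such as cpu=1 and # comments, and that a hostfile entry counts as "local" when it either matches the machine's name or resolves to one of the same addresses. The names parse_boot_schema, resolve, and local_host_listed are made up for illustration.

```python
import socket


def parse_boot_schema(text):
    """Return the host names listed in boot-schema text.

    Assumption: one host per line, first token is the host name,
    anything after '#' is a comment.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        hosts.append(line.split()[0])
    return hosts


def resolve(name):
    """Return the set of IPv4 addresses a name resolves to (empty on failure)."""
    try:
        return set(socket.gethostbyname_ex(name)[2])
    except OSError:
        return set()


def local_host_listed(hosts, local_name, resolver=resolve):
    """True if any listed host is the local name or shares an address with it."""
    local_addrs = resolver(local_name)
    return any(h == local_name or (resolver(h) & local_addrs) for h in hosts)


if __name__ == "__main__":
    with open("/etc/lam-mpi/lam-bhost.def") as f:
        listed = parse_boot_schema(f.read())
    me = socket.gethostname()
    print("hosts in schema:", listed)
    print(me, "resolves to", resolve(me))
    print("local host listed:", local_host_listed(listed, me))
```

If this prints False on a machine whose hostfile contains only localhost, the machine's own name and localhost resolve to different addresses, which would be one possible explanation for the message. Whether LAM compares hosts this way is a guess on my part.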