lamboot fails on every node of my Beowulf cluster.
Whenever I run lamboot, it fails with this error message:
"base: cannot find process schema (null): No such file or directory".
The output of lamboot -d is below:
[xenus_at_node8:~]$ cat lamhosts
node8 cpu=2
[xenus_at_node8:~]$ lamboot -d lamhosts
n0<267> ssi:boot: Opening
n0<267> ssi:boot: opening module globus
n0<267> ssi:boot: initializing module globus
n0<267> ssi:boot:globus: globus-job-run not found, globus boot will not run
n0<267> ssi:boot: module not available: globus
n0<267> ssi:boot: opening module rsh
n0<267> ssi:boot: initializing module rsh
n0<267> ssi:boot:rsh: module initializing
n0<267> ssi:boot:rsh:agent: /usr/bin/rsh
n0<267> ssi:boot:rsh:username: <same>
n0<267> ssi:boot:rsh:verbose: 1000
n0<267> ssi:boot:rsh:algorithm: linear
n0<267> ssi:boot:rsh:priority: 10
n0<267> ssi:boot: module available: rsh, priority: 10
n0<267> ssi:boot: finalizing module globus
n0<267> ssi:boot:globus: finalizing
n0<267> ssi:boot: closing module globus
n0<267> ssi:boot: Selected boot module rsh
LAM 7.0.4/MPI 2 C++/ROMIO - Indiana University
n0<267> ssi:boot:base: looking for boot schema in following directories:
n0<267> ssi:boot:base: <current directory>
n0<267> ssi:boot:base: $TROLLIUSHOME/etc
n0<267> ssi:boot:base: $LAMHOME/etc
n0<267> ssi:boot:base: /usr/lib/lam/etc
n0<267> ssi:boot:base: looking for boot schema file:
n0<267> ssi:boot:base: lamhosts
n0<267> ssi:boot:base: found boot schema: lamhosts
n0<267> ssi:boot:rsh: found the following hosts:
n0<267> ssi:boot:rsh: n0 node8 (cpu=2)
n0<267> ssi:boot:rsh: resolved hosts:
n0<267> ssi:boot:rsh: n0 node8 --> 192.168.42.8 (origin)
n0<267> ssi:boot:rsh: starting RTE procs
n0<267> ssi:boot:base:linear: starting
n0<267> ssi:boot:base:server: opening server TCP socket
n0<267> ssi:boot:base:server: opened port 32782
n0<267> ssi:boot:base:linear: booting n0 (node8)
n0<267> ssi:boot:rsh: starting lamd on (node8)
n0<267> ssi:boot:rsh: starting on n0 (node8): hboot -t -c lam-conf.lamd -d -I -H 192.168.42.8 -P 32782 -n 0 -o 0
n0<267> ssi:boot:rsh: launching locally
base: cannot find process schema (null): No such file or directory
-----------------------------------------------------------------------------
*** Oops -- cannot find the help that you're supposed to get.
*** Using the following help file:
***
*** /usr/lib/lam/etc/lam-helpfile
***
*** You were supposed to get help on the program "hboot"
*** about the topic "cant-parse-config"
*** But it doesn't seem to be in that file.
***
*** Sorry!
-----------------------------------------------------------------------------
n0<267> ssi:boot:base:linear: Failed to boot n0 (node8)
n0<267> ssi:boot:base:server: closing server socket
n0<267> ssi:boot:base:linear: aborted!
-----------------------------------------------------------------------------
*** Oops -- cannot find the help that you're supposed to get.
*** Using the following help file:
***
*** /usr/lib/lam/etc/lam-helpfile
***
*** You were supposed to get help on the program "boot"
*** about the topic "about-to-wipe"
*** But it doesn't seem to be in that file.
***
*** Sorry!
-----------------------------------------------------------------------------
lamboot: wipe -- nothing to do
lamboot did NOT complete successfully
Any ideas?
--
Sincerely, Kiyoung