The new -prefix option to lamboot looks like a powerful feature for running multiple LAM
installations side by side. I built a recent LAM beta and tried the -prefix option, but lamboot
fails when it tries to start lamd (see the debug output below).
My lamboot command is as follows (I'm running under PBS):
lamboot -d -prefix /fltapps/boeing/cfd/mpi/lam7.1.1_pgf
Here is the debug output:
n-1<30129> ssi:boot:open: opening
n-1<30129> ssi:boot:open: opening boot module globus
n-1<30129> ssi:boot:open: opened boot module globus
n-1<30129> ssi:boot:open: opening boot module rsh
n-1<30129> ssi:boot:open: opened boot module rsh
n-1<30129> ssi:boot:open: opening boot module slurm
n-1<30129> ssi:boot:open: opened boot module slurm
n-1<30129> ssi:boot:open: opening boot module tm
n-1<30129> ssi:boot:open: opened boot module tm
n-1<30129> ssi:boot:select: initializing boot module tm
n-1<30129> ssi:boot:tm: module initializing
n-1<30129> ssi:boot:tm:verbose: 1000
n-1<30129> ssi:boot:tm:priority: 75
n-1<30129> ssi:boot:select: boot module available: tm, priority: 75
n-1<30129> ssi:boot:select: initializing boot module slurm
n-1<30129> ssi:boot:slurm: not running under SLURM
n-1<30129> ssi:boot:select: boot module not available: slurm
n-1<30129> ssi:boot:select: initializing boot module rsh
n-1<30129> ssi:boot:rsh: module initializing
n-1<30129> ssi:boot:rsh:agent: rsh
n-1<30129> ssi:boot:rsh:username: <same>
n-1<30129> ssi:boot:rsh:verbose: 1000
n-1<30129> ssi:boot:rsh:algorithm: linear
n-1<30129> ssi:boot:rsh:no_n: 0
n-1<30129> ssi:boot:rsh:no_profile: 0
n-1<30129> ssi:boot:rsh:fast: 0
n-1<30129> ssi:boot:rsh:ignore_stderr: 0
n-1<30129> ssi:boot:rsh:priority: 10
n-1<30129> ssi:boot:select: boot module available: rsh, priority: 10
n-1<30129> ssi:boot:select: initializing boot module globus
n-1<30129> ssi:boot:globus: globus-job-run not found, globus boot will not run
n-1<30129> ssi:boot:select: boot module not available: globus
n-1<30129> ssi:boot:select: finalizing boot module slurm
n-1<30129> ssi:boot:slurm: finalizing
n-1<30129> ssi:boot:select: closing boot module slurm
n-1<30129> ssi:boot:select: finalizing boot module rsh
n-1<30129> ssi:boot:rsh: finalizing
n-1<30129> ssi:boot:select: closing boot module rsh
n-1<30129> ssi:boot:select: finalizing boot module globus
n-1<30129> ssi:boot:globus: finalizing
n-1<30129> ssi:boot:select: closing boot module globus
n-1<30129> ssi:boot:select: selected boot module tm
n-1<30129> ssi:boot:tm: found the following 4 hosts:
n-1<30129> ssi:boot:tm: n0 hsd354 (cpu=2)
n-1<30129> ssi:boot:tm: n1 hsd352 (cpu=2)
n-1<30129> ssi:boot:tm: n2 hsd351 (cpu=2)
n-1<30129> ssi:boot:tm: n3 hsd350 (cpu=2)
n-1<30129> ssi:boot:tm: starting RTE procs
n-1<30129> ssi:boot:base:linear_windowed: starting
n-1<30129> ssi:boot:base:linear_windowed: window size: 5
n-1<30129> ssi:boot:base:server: opening server TCP socket
n-1<30129> ssi:boot:base:server: opened port 35051
n-1<30129> ssi:boot:base:linear_windowed: booting n0 (hsd354)
n-1<30129> ssi:boot:tm: starting wipe on (hsd354)
n-1<30129> ssi:boot:tm: starting on n0 (hsd354): /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/bin/tkill -setsid -d
n-1<30129> ssi:boot:tm: successfully launched on n0 (hsd354)
n-1<30129> ssi:boot:tm: waiting for completion on n0 (hsd354)
n-1<30129> ssi:boot:tm: finished on n0 (hsd354)
n-1<30129> ssi:boot:tm: starting lamd on (hsd354)
base: cannot open lam-conf.lamd: No such file or directory
-----------------------------------------------------------------------------
hboot could not parse the boot configuration file. A number
of problems can result in this error messages:
- Is the configuration file installed properly?
- Did you specify a file name that does not exit when
using the -c option to lamboot?
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host hsd354.
This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamhalt" command.
Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
The file lam-conf.lamd does exist in my LAMHOME/etc directory. Note also that the job runs
correctly if I remove -prefix from my lamboot command.
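For anyone trying to reproduce this, a check along the following lines should confirm that the file is readable under the prefix on a given node. This is only a sketch: the PREFIX variable and the check_conf helper are names made up for illustration, and it assumes the file lives in the etc directory under the prefix (which matches the Sysconfdir that laminfo reports). It does not show where hboot actually looks when -prefix is given.

```shell
#!/bin/sh
# Hypothetical helper, not part of LAM: report whether a config file is readable.
check_conf() {
    if [ -r "$1" ]; then
        echo "found: $1"
    else
        echo "missing or unreadable: $1"
        return 1
    fi
}

# PREFIX is an assumed variable name; it defaults to the path passed to lamboot -prefix above.
PREFIX="${PREFIX:-/fltapps/boeing/cfd/mpi/lam7.1.1_pgf}"

# Assumes the boot configuration file sits in etc/ under the prefix.
check_conf "$PREFIX/etc/lam-conf.lamd" || true
```

Running it on each of the hosts in the PBS job would show whether the prefix directory is visible on every node.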
Here is the output of laminfo -all:
LAM/MPI: 7.2b1svn9913
SSI boot: globus (SSI v1.0, API v1.1, Module v0.6)
SSI boot: rsh (SSI v1.0, API v1.1, Module v1.1)
SSI boot: slurm (SSI v1.0, API v1.1, Module v1.0)
SSI boot: tm (SSI v1.0, API v1.1, Module v1.1)
SSI coll: lam_basic (SSI v1.0, API v1.1, Module v7.1)
SSI coll: shmem (SSI v1.0, API v1.1, Module v1.0)
SSI coll: smp (SSI v1.0, API v1.1, Module v1.2)
SSI rpi: crtcp (SSI v1.0, API v1.1, Module v1.1)
SSI rpi: lamd (SSI v1.0, API v1.0, Module v7.1)
SSI rpi: sysv (SSI v1.0, API v1.0, Module v7.1)
SSI rpi: tcp (SSI v1.0, API v1.0, Module v7.1)
SSI rpi: usysv (SSI v1.0, API v1.0, Module v7.1)
SSI cr: self (SSI v1.0, API v1.0, Module v1.0)
Prefix: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf
Bindir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/bin
Libdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/lib
Incdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/include
Pkglibdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/lib/lam
Sysconfdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/etc
Architecture: i686-pc-linux-gnu
Configured by: borensbs
Configured on: Tue Nov 16 07:25:19 PST 2004
Configure host: li13200
Memory manager: none
C bindings: yes
C++ bindings: yes
Fortran bindings: yes
C compiler: pgcc
C char size: 1
C bool size: 1
C short size: 2
C int size: 4
C long size: 4
C float size: 4
C double size: 8
C pointer size: 4
C char align: 1
C bool align: 1
C int align: 4
C float align: 4
C double align: 8
C++ compiler: pgCC
Fortran compiler: pgf77
Fortran symbols: underscore
Fort integer size: 4
Fort real size: 4
Fort dbl prec size: 4
Fort cplx size: 4
Fort dbl cplx size: 4
Fort integer align: 4
Fort real align: 4
Fort dbl prec align: 4
Fort cplx align: 4
Fort dbl cplx align: 4
C profiling: yes
C++ profiling: yes
Fortran profiling: yes
C++ exceptions: no
Thread support: yes
ROMIO support: yes
IMPI support: no
Debug support: no
Purify clean: no
SSI base: parameter "verbose" (default value: <none>)
SSI mpi: parameter "mpi_hostmap" (default value:
"/fltapps/boeing/cfd/mpi/lam7.1.1_pgf/etc/lam-hostmap.txt")
SSI base: parameter "base_module_path" (default value:
"/fltapps/boeing/cfd/mpi/lam7.1.1_pgf/lib/lam")
SSI boot: parameter "boot_verbose" (default value: <none>)
SSI boot: parameter "boot" (default value: <none>)
SSI boot: parameter "boot_base_promisc" (default value: "0")
SSI boot: parameter "boot_base_window_size" (default value: "5")
SSI boot: parameter "boot_globus_priority" (default value: "3")
SSI boot: parameter "boot_rsh_username" (default value: <none>)
SSI boot: parameter "boot_rsh_agent" (default value: "rsh")
SSI boot: parameter "boot_rsh_no_n" (default value: "0")
SSI boot: parameter "boot_rsh_no_profile" (default value: "0")
SSI boot: parameter "boot_rsh_fast" (default value: "0")
SSI boot: parameter "boot_rsh_ignore_stderr" (default value: "0")
SSI boot: parameter "boot_rsh_priority" (default value: "10")
SSI boot: parameter "boot_slurm_priority" (default value: "50")
SSI boot: parameter "boot_tm_priority" (default value: "75")
SSI boot: parameter "boot_tm_first" (default value: "-1")
SSI rpi: parameter "rpi_verbose" (default value: <none>)
SSI rpi: parameter "rpi" (default value: <none>)
SSI rpi: parameter "rpi_crtcp_priority" (default value: "25")
SSI rpi: parameter "rpi_crtcp_short" (default value: "65536")
SSI rpi: parameter "rpi_crtcp_sockbuf" (default value: "-1")
SSI rpi: parameter "rpi_lamd_priority" (default value: "20")
SSI rpi: parameter "rpi_sysv_pollyield" (default value: "1")
SSI rpi: parameter "rpi_sysv_poolsize" (default value:
"16777216")
SSI rpi: parameter "rpi_sysv_maxalloc" (default value:
"1048576")
SSI rpi: parameter "rpi_sysv_short" (default value: "8192")
SSI rpi: parameter "rpi_tcp_short" (default value: "65536")
SSI rpi: parameter "rpi_tcp_sockbuf" (default value: "-1")
SSI rpi: parameter "rpi_sysv_priority" (default value: "30")
SSI rpi: parameter "rpi_tcp_priority" (default value: "20")
SSI rpi: parameter "rpi_usysv_readlockpoll" (default value:
"10000")
SSI rpi: parameter "rpi_usysv_writelockpoll" (default value:
"10")
SSI rpi: parameter "rpi_usysv_pollyield" (default value: "1")
SSI rpi: parameter "rpi_usysv_poolsize" (default value:
"16777216")
SSI rpi: parameter "rpi_usysv_maxalloc" (default value:
"1048576")
SSI rpi: parameter "rpi_usysv_short" (default value: "8192")
SSI rpi: parameter "rpi_usysv_priority" (default value: "40")
SSI coll: parameter "coll_verbose" (default value: <none>)
SSI coll: parameter "coll_shmem" (default value: "0")
SSI cr: parameter "cr_verbose" (default value: <none>)
SSI cr: parameter "cr" (default value: <none>)
SSI cr: parameter "cr_self_priority" (default value: "25")
SSI cr: parameter "cr_self_do_restart" (default value: "0")
SSI cr: parameter "cr_self_prefix" (default value:
"lam_cr_self")
SSI cr: parameter "cr_self_checkpoint" (default value: <none>)
SSI cr: parameter "cr_self_continue" (default value: <none>)
SSI cr: parameter "cr_self_restart" (default value: <none>)
Thanks for a great product.
Bernie Borenstein
The Boeing Company