LAM/MPI General User's Mailing List Archives

From: Borenstein, Bernard S (bernard.s.borenstein_at_[hidden])
Date: 2004-11-18 12:53:03


The new -prefix option to lamboot looks like a powerful feature for running multiple LAM
installations side by side. I built a recent LAM beta and tried -prefix, but lamboot failed with the error shown in the debug output below.

My lamboot command is below; I am running under PBS:

lamboot -d -prefix /fltapps/boeing/cfd/mpi/lam7.1.1_pgf

Here is the debug output:

n-1<30129> ssi:boot:open: opening
n-1<30129> ssi:boot:open: opening boot module globus
n-1<30129> ssi:boot:open: opened boot module globus
n-1<30129> ssi:boot:open: opening boot module rsh
n-1<30129> ssi:boot:open: opened boot module rsh
n-1<30129> ssi:boot:open: opening boot module slurm
n-1<30129> ssi:boot:open: opened boot module slurm
n-1<30129> ssi:boot:open: opening boot module tm
n-1<30129> ssi:boot:open: opened boot module tm
n-1<30129> ssi:boot:select: initializing boot module tm
n-1<30129> ssi:boot:tm: module initializing
n-1<30129> ssi:boot:tm:verbose: 1000
n-1<30129> ssi:boot:tm:priority: 75
n-1<30129> ssi:boot:select: boot module available: tm, priority: 75
n-1<30129> ssi:boot:select: initializing boot module slurm
n-1<30129> ssi:boot:slurm: not running under SLURM
n-1<30129> ssi:boot:select: boot module not available: slurm
n-1<30129> ssi:boot:select: initializing boot module rsh
n-1<30129> ssi:boot:rsh: module initializing
n-1<30129> ssi:boot:rsh:agent: rsh
n-1<30129> ssi:boot:rsh:username: <same>
n-1<30129> ssi:boot:rsh:verbose: 1000
n-1<30129> ssi:boot:rsh:algorithm: linear
n-1<30129> ssi:boot:rsh:no_n: 0
n-1<30129> ssi:boot:rsh:no_profile: 0
n-1<30129> ssi:boot:rsh:fast: 0
n-1<30129> ssi:boot:rsh:ignore_stderr: 0
n-1<30129> ssi:boot:rsh:priority: 10
n-1<30129> ssi:boot:select: boot module available: rsh, priority: 10
n-1<30129> ssi:boot:select: initializing boot module globus
n-1<30129> ssi:boot:globus: globus-job-run not found, globus boot will not run
n-1<30129> ssi:boot:select: boot module not available: globus
n-1<30129> ssi:boot:select: finalizing boot module slurm
n-1<30129> ssi:boot:slurm: finalizing
n-1<30129> ssi:boot:select: closing boot module slurm
n-1<30129> ssi:boot:select: finalizing boot module rsh
n-1<30129> ssi:boot:rsh: finalizing
n-1<30129> ssi:boot:select: closing boot module rsh
n-1<30129> ssi:boot:select: finalizing boot module globus
n-1<30129> ssi:boot:globus: finalizing
n-1<30129> ssi:boot:select: closing boot module globus
n-1<30129> ssi:boot:select: selected boot module tm
n-1<30129> ssi:boot:tm: found the following 4 hosts:
n-1<30129> ssi:boot:tm: n0 hsd354 (cpu=2)
n-1<30129> ssi:boot:tm: n1 hsd352 (cpu=2)
n-1<30129> ssi:boot:tm: n2 hsd351 (cpu=2)
n-1<30129> ssi:boot:tm: n3 hsd350 (cpu=2)
n-1<30129> ssi:boot:tm: starting RTE procs
n-1<30129> ssi:boot:base:linear_windowed: starting
n-1<30129> ssi:boot:base:linear_windowed: window size: 5
n-1<30129> ssi:boot:base:server: opening server TCP socket
n-1<30129> ssi:boot:base:server: opened port 35051
n-1<30129> ssi:boot:base:linear_windowed: booting n0 (hsd354)
n-1<30129> ssi:boot:tm: starting wipe on (hsd354)
n-1<30129> ssi:boot:tm: starting on n0 (hsd354): /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/bin/tkill -setsid -d
n-1<30129> ssi:boot:tm: successfully launched on n0 (hsd354)
n-1<30129> ssi:boot:tm: waiting for completion on n0 (hsd354)
n-1<30129> ssi:boot:tm: finished on n0 (hsd354)
n-1<30129> ssi:boot:tm: starting lamd on (hsd354)
base: cannot open lam-conf.lamd: No such file or directory
-----------------------------------------------------------------------------
hboot could not parse the boot configuration file. A number
of problems can result in this error messages:

  - Is the configuration file installed properly?
  - Did you specify a file name that does not exit when
    using the -c option to lamboot?
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host hsd354.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamhalt" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------

The file lam-conf.lamd does exist in my LAMHOME/etc directory. Note that the job runs if I remove
the -prefix option from my lamboot command.
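One way to confirm that the file is visible from a given node is a small shell check such as the sketch below. It assumes the configuration file sits in the etc directory under the prefix, which is what the Sysconfdir line of the laminfo output suggests. The function name check_lam_conf is made up for illustration and is not part of LAM/MPI.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM/MPI): report whether the lamd boot
# configuration file is readable under a given installation prefix.
# Assumption: the file lives in <prefix>/etc, matching Sysconfdir in laminfo.
check_lam_conf() {
    prefix="$1"
    conf="$prefix/etc/lam-conf.lamd"
    if [ -r "$conf" ]; then
        echo "found: $conf"
        return 0
    fi
    echo "missing or unreadable: $conf"
    return 1
}

# Example: check_lam_conf /fltapps/boeing/cfd/mpi/lam7.1.1_pgf
```

Running this on each host listed in the debug output would show whether the path is readable there. It does not by itself explain why hboot fails to find the file when -prefix is given.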

My laminfo -all output:

            LAM/MPI: 7.2b1svn9913
            SSI boot: globus (SSI v1.0, API v1.1, Module v0.6)
            SSI boot: rsh (SSI v1.0, API v1.1, Module v1.1)
            SSI boot: slurm (SSI v1.0, API v1.1, Module v1.0)
            SSI boot: tm (SSI v1.0, API v1.1, Module v1.1)
            SSI coll: lam_basic (SSI v1.0, API v1.1, Module v7.1)
            SSI coll: shmem (SSI v1.0, API v1.1, Module v1.0)
            SSI coll: smp (SSI v1.0, API v1.1, Module v1.2)
             SSI rpi: crtcp (SSI v1.0, API v1.1, Module v1.1)
             SSI rpi: lamd (SSI v1.0, API v1.0, Module v7.1)
             SSI rpi: sysv (SSI v1.0, API v1.0, Module v7.1)
             SSI rpi: tcp (SSI v1.0, API v1.0, Module v7.1)
             SSI rpi: usysv (SSI v1.0, API v1.0, Module v7.1)
              SSI cr: self (SSI v1.0, API v1.0, Module v1.0)
              Prefix: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf
              Bindir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/bin
              Libdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/lib
              Incdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/include
           Pkglibdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/lib/lam
          Sysconfdir: /fltapps/boeing/cfd/mpi/lam7.1.1_pgf/etc
        Architecture: i686-pc-linux-gnu
       Configured by: borensbs
       Configured on: Tue Nov 16 07:25:19 PST 2004
      Configure host: li13200
      Memory manager: none
          C bindings: yes
        C++ bindings: yes
    Fortran bindings: yes
          C compiler: pgcc
         C char size: 1
         C bool size: 1
        C short size: 2
          C int size: 4
         C long size: 4
        C float size: 4
       C double size: 8
      C pointer size: 4
        C char align: 1
        C bool align: 1
         C int align: 4
       C float align: 4
      C double align: 8
        C++ compiler: pgCC
    Fortran compiler: pgf77
     Fortran symbols: underscore
   Fort integer size: 4
      Fort real size: 4
  Fort dbl prec size: 4
      Fort cplx size: 4
  Fort dbl cplx size: 4
  Fort integer align: 4
     Fort real align: 4
 Fort dbl prec align: 4
     Fort cplx align: 4
 Fort dbl cplx align: 4
         C profiling: yes
       C++ profiling: yes
   Fortran profiling: yes
      C++ exceptions: no
      Thread support: yes
       ROMIO support: yes
        IMPI support: no
       Debug support: no
        Purify clean: no
            SSI base: parameter "verbose" (default value: <none>)
             SSI mpi: parameter "mpi_hostmap" (default value:
                      "/fltapps/boeing/cfd/mpi/lam7.1.1_pgf/etc/lam-hostmap.txt")
            SSI base: parameter "base_module_path" (default value:
                      "/fltapps/boeing/cfd/mpi/lam7.1.1_pgf/lib/lam")
            SSI boot: parameter "boot_verbose" (default value: <none>)
            SSI boot: parameter "boot" (default value: <none>)
            SSI boot: parameter "boot_base_promisc" (default value: "0")
            SSI boot: parameter "boot_base_window_size" (default value: "5")
            SSI boot: parameter "boot_globus_priority" (default value: "3")
            SSI boot: parameter "boot_rsh_username" (default value: <none>)
            SSI boot: parameter "boot_rsh_agent" (default value: "rsh")
            SSI boot: parameter "boot_rsh_no_n" (default value: "0")
            SSI boot: parameter "boot_rsh_no_profile" (default value: "0")
            SSI boot: parameter "boot_rsh_fast" (default value: "0")
            SSI boot: parameter "boot_rsh_ignore_stderr" (default value: "0")
            SSI boot: parameter "boot_rsh_priority" (default value: "10")
            SSI boot: parameter "boot_slurm_priority" (default value: "50")
            SSI boot: parameter "boot_tm_priority" (default value: "75")
            SSI boot: parameter "boot_tm_first" (default value: "-1")
             SSI rpi: parameter "rpi_verbose" (default value: <none>)
             SSI rpi: parameter "rpi" (default value: <none>)
             SSI rpi: parameter "rpi_crtcp_priority" (default value: "25")
             SSI rpi: parameter "rpi_crtcp_short" (default value: "65536")
             SSI rpi: parameter "rpi_crtcp_sockbuf" (default value: "-1")
             SSI rpi: parameter "rpi_lamd_priority" (default value: "20")
             SSI rpi: parameter "rpi_sysv_pollyield" (default value: "1")
             SSI rpi: parameter "rpi_sysv_poolsize" (default value:
                      "16777216")
             SSI rpi: parameter "rpi_sysv_maxalloc" (default value:
                      "1048576")
             SSI rpi: parameter "rpi_sysv_short" (default value: "8192")
             SSI rpi: parameter "rpi_tcp_short" (default value: "65536")
             SSI rpi: parameter "rpi_tcp_sockbuf" (default value: "-1")
             SSI rpi: parameter "rpi_sysv_priority" (default value: "30")
             SSI rpi: parameter "rpi_tcp_priority" (default value: "20")
             SSI rpi: parameter "rpi_usysv_readlockpoll" (default value:
                      "10000")
             SSI rpi: parameter "rpi_usysv_writelockpoll" (default value:
                      "10")
             SSI rpi: parameter "rpi_usysv_pollyield" (default value: "1")
             SSI rpi: parameter "rpi_usysv_poolsize" (default value:
                      "16777216")
             SSI rpi: parameter "rpi_usysv_maxalloc" (default value:
                      "1048576")
             SSI rpi: parameter "rpi_usysv_short" (default value: "8192")
             SSI rpi: parameter "rpi_usysv_priority" (default value: "40")
            SSI coll: parameter "coll_verbose" (default value: <none>)
            SSI coll: parameter "coll_shmem" (default value: "0")
              SSI cr: parameter "cr_verbose" (default value: <none>)
              SSI cr: parameter "cr" (default value: <none>)
              SSI cr: parameter "cr_self_priority" (default value: "25")
              SSI cr: parameter "cr_self_do_restart" (default value: "0")
              SSI cr: parameter "cr_self_prefix" (default value:
                      "lam_cr_self")
              SSI cr: parameter "cr_self_checkpoint" (default value: <none>)
              SSI cr: parameter "cr_self_continue" (default value: <none>)
              SSI cr: parameter "cr_self_restart" (default value: <none>)
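
To compare the directory LAM was configured with against the path passed to -prefix, the Sysconfdir line can be pulled out of this output. The sketch below assumes the "Sysconfdir: <path>" layout shown above; lam_sysconfdir is a hypothetical helper name, not a LAM/MPI command.

```shell
#!/bin/sh
# Hypothetical helper: print the Sysconfdir value from `laminfo -all` text
# read on standard input. Assumes the "Sysconfdir: <path>" layout shown in
# the laminfo listing, with optional leading whitespace before the label.
lam_sysconfdir() {
    awk -F': ' '$1 ~ /^[ \t]*Sysconfdir$/ { print $2; exit }'
}

# Example: laminfo -all | lam_sysconfdir
```

If the printed directory differs from <prefix>/etc for the prefix given to lamboot, that mismatch would be worth including in a report like this one.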

Thanks for a great product.

Bernie Borenstein
The Boeing Company