On Feb 27, 2006, at 2:32 PM, Jeffrey B. Layton wrote:
> Josh,
>
> I don't see it. Here's the output:
>
> jlayton_at_o1:~> /usr/x86_64-pgi-6.0/lam-7.1.1/bin/laminfo
> LAM/MPI: 7.1.1
> Prefix: /usr/x86_64-pgi-6.0/lam-7.1.1
> Architecture: x86_64-suse-linux-gnu
<snip>
> SSI boot: globus (API v1.1, Module v0.6)
> SSI boot: rsh (API v1.1, Module v1.1)
> SSI boot: slurm (API v1.1, Module v1.0)
<snip>
>
>
> Do you have to have support for PBS to use PBS with LAM?
Yes. LAM has supported PBS for quite a while now. At configure time,
LAM looks for the PBS-related binaries/libraries in the standard places
and gives up if it cannot find them. I bet the binaries/libraries for
PBS are in a non-obvious place, and LAM was not configured with
explicit PBS support.
To see how to configure LAM with PBS support, take a look at the
section "TM (OpenPBS/PBS Pro/Torque) boot Module" in the LAM/MPI
Installation Guide [section 6.4.2], which can be found here:
http://www.lam-mpi.org/using/docs/
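If it is unclear where PBS lives on the system, one way to find a
candidate prefix to hand to configure is to search for the TM header.
This is only a rough sketch: the function name find_tm_prefix and the
search roots are made up for illustration, and it assumes the PBS
install ships tm.h under <prefix>/include, as OpenPBS and Torque
normally do. The exact configure option to pass the prefix to is
described in the Installation Guide section mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM or PBS): print candidate PBS
# install prefixes found under the given search roots by locating the
# TM header, tm.h.
find_tm_prefix() {
    for root in "$@"; do
        # Skip search roots that do not exist on this machine.
        [ -d "$root" ] || continue
        # Assumption: tm.h sits in <prefix>/include, so the prefix is
        # two directory levels above the header file.
        find "$root" -type f -name tm.h 2>/dev/null |
        while IFS= read -r hdr; do
            dirname "$(dirname "$hdr")"
        done
    done
}

# Example (the search roots are guesses; adjust for your site):
#   find_tm_prefix /usr/local /opt /usr/pbs
```

Each line printed is a directory that may be worth giving to configure
as the PBS location. An empty result suggests that the PBS development
files are not installed on that machine at all.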
From our discussion today, it sounds like your sysadmin didn't
install LAM properly, and that is causing the problems you are
seeing. I would ask them to reinstall LAM and ensure they enable the
PBS tm boot module.
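After the reinstall, a quick way to confirm the result is to scan the
laminfo output for a "tm" entry among the "SSI boot" lines (your
current output lists only globus, rsh, and slurm). A minimal sketch,
assuming the line format shown in your output; the function name
has_tm_boot is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read laminfo output on stdin and succeed (exit
# status 0) only if a boot module named "tm" is listed.
has_tm_boot() {
    # Match lines such as "SSI boot: tm (API v1.1, Module v1.1)",
    # allowing leading whitespace before "SSI".
    grep -Eq '^[[:space:]]*SSI boot:[[:space:]]*tm([[:space:]]|$)'
}

# Example:
#   laminfo | has_tm_boot && echo "tm boot module is available"
```

If the check fails on the new install, LAM was most likely still built
without finding the PBS libraries.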
I hope this helps a bit,
Josh
>
> Thanks!
>
> Jeff
>
>
>> As a quick followup.
>>
>> You should check the output of the 'laminfo' command to see if the
>> 'tm' boot module is available. If not, request that your sys admin
>> configure/compile in support for PBS as described in the install
>> documentation.
>>
>> -- Josh
>>
>> On Feb 27, 2006, at 11:29 AM, Josh Hursey wrote:
>>
>>
>>
>>> Jeff,
>>>
>>> It seems that the install is not quite complete or an environment
>>> variable is set improperly.
>>> As a sanity check, make sure you don't have $LAMHOME set on any of
>>> the machines.
>>>
>>> The lamboot problem is likely due to not finding the default
>>> hostfile [lam-bhost.def] (along with the helpfiles) on one of the
>>> machines (o1). These are installed in $PREFIX/etc (by default
>>> /usr/local/etc).
>>> I would look around on the machine to see if the files
>>> [lam-bhost.def] and [lam-helpfile] are installed properly. If they
>>> are in an odd directory (say /san/lam-7.1.1/etc), you could try
>>> setting $LAMHOME to the root of that directory (/san/lam-7.1.1), and
>>> see if that helps at all.
>>>
>>> As a temporary workaround, you could see if lamboot works properly
>>> with a local hostfile:
>>> $ cat my-bhost.def
>>> localhost
>>> $ lamboot -v my-bhost.def
>>>
>>> -- Josh
>>>
>>> On Feb 27, 2006, at 10:46 AM, Jeffrey B. Layton wrote:
>>>
>>>
>>>
>>>> Hello,
>>>>
>>>> I'm trying to run a code built with PGI 6.0 and LAM-7.1.1
>>>> on an Opteron system (SLES 9, SP2). The code builds
>>>> correctly, but when I try to lamboot I get the following
>>>> error message:
>>>>
>>>> n-1<29905> ssi:boot:base:linear: booting n0 (o1)
>>>> base: cannot find process schema (null): No such file or directory
>>>> -----------------------------------------------------------------------------
>>>>
>>>> *** Oops -- I cannot open the LAM help file.
>>>> *** I tried looking for it in the following places:
>>>> ***
>>>> *** $HOME/lam-helpfile
>>>> *** $HOME/lam-7.1.1-helpfile
>>>> *** $HOME/etc/lam-helpfile
>>>> *** $HOME/etc/lam-7.1.1-helpfile
>>>> *** $LAMHELPDIR/lam-helpfile
>>>> *** $LAMHELPDIR/lam-7.1.1-helpfile
>>>> *** $LAMHOME/etc/lam-helpfile
>>>> *** $LAMHOME/etc/lam-7.1.1-helpfile
>>>> *** $SYSCONFDIR/lam-helpfile
>>>> *** $SYSCONFDIR/lam-7.1.1-helpfile
>>>> ***
>>>> *** You were supposed to get help on the program "hboot"
>>>> *** about the topic "cant-parse-config"
>>>> ***
>>>> *** Sorry!
>>>> -----------------------------------------------------------------------------
>>>>
>>>>
>>>>
>>>> So I assume something is wrong and I try using recon to see what's
>>>> going on. Here is the output from the first node:
>>>>
>>>>
>>>> n-1<25563> ssi:boot:base:linear: booting n0 (o1)
>>>> n-1<25563> ssi:boot:base:linear: Failed to boot n0 (o1)
>>>> n-1<25563> ssi:boot:base:linear: aborted!
>>>> -----------------------------------------------------------------------------
>>>> *** Oops -- I cannot open the LAM help file.
>>>> *** I tried looking for it in the following places:
>>>> ***
>>>> *** $HOME/lam-helpfile
>>>> *** $HOME/lam-7.1.1-helpfile
>>>> *** $HOME/etc/lam-helpfile
>>>> *** $HOME/etc/lam-7.1.1-helpfile
>>>> *** $LAMHELPDIR/lam-helpfile
>>>> *** $LAMHELPDIR/lam-7.1.1-helpfile
>>>> *** $LAMHOME/etc/lam-helpfile
>>>> *** $LAMHOME/etc/lam-7.1.1-helpfile
>>>> *** $SYSCONFDIR/lam-helpfile
>>>> *** $SYSCONFDIR/lam-7.1.1-helpfile
>>>> ***
>>>> *** You were supposed to get help on the program "recon"
>>>> *** about the topic "unhappiness"
>>>> ***
>>>> *** Sorry!
>>>> -----------------------------------------------------------------------------
>>>>
>>>>
>>>> I assume something is wrong with the installation. Any ideas?
>>>> (I didn't do the build nor the installation).
>>>>
>>>> Thanks!
>>>>
>>>> Jeff
>>>>
>>>>
>>>> _______________________________________________
>>>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>>>
>>>>
>>> ----
>>> Josh Hursey
>>> jjhursey_at_[hidden]
>>> http://www.lam-mpi.org/
>>>
>>
>> ----
>> Josh Hursey
>> jjhursey_at_[hidden]
>> http://www.lam-mpi.org/
>>
----
Josh Hursey
jjhursey_at_[hidden]
http://www.lam-mpi.org/