LAM/MPI General User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2004-07-19 10:47:30


This looks like the classic LAM output when a firewall is running on
that host. You might want to bug your sysadmins: lamboot opens a random
port (32823 in your output), tells hboot the port number, and then hboot
tries to connect back. The connect-back step is what is failing.
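
If you want to check the connect-back step outside of LAM, here is a minimal sketch in Python. It is not LAM's own code, and the function name connect_back_works is made up for illustration. It listens on a port chosen by the OS, then tries to open a TCP connection to that port on the same address, which is roughly the pattern described above. If the connection is refused or times out, a local firewall rule is a likely suspect.

```python
import socket


def connect_back_works(host: str, timeout: float = 5.0) -> bool:
    """Listen on an OS-chosen port on `host`, then try to connect to it.

    `host` must be an address of this machine; otherwise bind() raises OSError.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, 0))  # port 0 asks the OS for any free port
        server.listen(1)
        port = server.getsockname()[1]
        try:
            # The kernel completes the handshake for a listening socket,
            # so no accept() call is needed for this check.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Refused, filtered, or timed out.
            return False


if __name__ == "__main__":
    host = "127.0.0.1"  # assumption: replace with the address in your hostfile
    ok = connect_back_works(host)
    print(host, "connect-back OK" if ok else "connect-back FAILED")
```

Run it once with 127.0.0.1 and once with the address from your boot schema (192.168.11.27 in the output below). If loopback works but the real address fails, that points at filtering on that interface.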

Brian

On Jul 19, 2004, at 10:39 AM, Imran Ahmed khan wrote:

> When I use -d, I get this error:
>
> n-1<3529> ssi:boot: Opening
> n-1<3529> ssi:boot: opening module globus
> n-1<3529> ssi:boot: initializing module globus
> n-1<3529> ssi:boot:globus: globus-job-run not found, globus boot will not run
> n-1<3529> ssi:boot: module not available: globus
> n-1<3529> ssi:boot: opening module rsh
> n-1<3529> ssi:boot: initializing module rsh
> n-1<3529> ssi:boot:rsh: module initializing
> n-1<3529> ssi:boot:rsh:agent: rsh
> n-1<3529> ssi:boot:rsh:username: <same>
> n-1<3529> ssi:boot:rsh:verbose: 1000
> n-1<3529> ssi:boot:rsh:algorithm: linear
> n-1<3529> ssi:boot:rsh:priority: 10
> n-1<3529> ssi:boot: module available: rsh, priority: 10
> n-1<3529> ssi:boot: finalizing module globus
> n-1<3529> ssi:boot:globus: finalizing
> n-1<3529> ssi:boot: closing module globus
> n-1<3529> ssi:boot: Selected boot module rsh
>
> LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
>
> n-1<3529> ssi:boot:base: looking for boot schema in following directories:
> n-1<3529> ssi:boot:base: <current directory>
> n-1<3529> ssi:boot:base: $TROLLIUSHOME/etc
> n-1<3529> ssi:boot:base: $LAMHOME/etc
> n-1<3529> ssi:boot:base: /lam/etc
> n-1<3529> ssi:boot:base: looking for boot schema file:
> n-1<3529> ssi:boot:base: hostfile
> n-1<3529> ssi:boot:base: found boot schema: hostfile
> n-1<3529> ssi:boot:rsh: found the following hosts:
> n-1<3529> ssi:boot:rsh: n0 192.168.11.27 (cpu=1)
> n-1<3529> ssi:boot:rsh: resolved hosts:
> n-1<3529> ssi:boot:rsh: n0 192.168.11.27 --> 192.168.11.27 (origin)
> n-1<3529> ssi:boot:rsh: starting RTE procs
> n-1<3529> ssi:boot:base:linear: starting
> n-1<3529> ssi:boot:base:server: opening server TCP socket
> n-1<3529> ssi:boot:base:server: opened port 32823
> n-1<3529> ssi:boot:base:linear: booting n0 (192.168.11.27)
> n-1<3529> ssi:boot:rsh: starting lamd on (192.168.11.27)
> n-1<3529> ssi:boot:rsh: starting on n0 (192.168.11.27): hboot -t -c lam-conf.lamd -d -I -H 192.168.11.27 -P 32823 -n 0 -o 0
> n-1<3529> ssi:boot:rsh: launching locally
> n-1<3529> ssi:boot:base:linear: Failed to boot n0 (192.168.11.27)
> n-1<3529> ssi:boot:base:server: closing server socket
> n-1<3529> ssi:boot:base:linear: aborted!
> -----------------------------------------------------------------------------
> lamboot encountered some error (see above) during the boot process,
> and will now attempt to kill all nodes that it was previously able to
> boot (if any).
>
> Please wait for LAM to finish; if you interrupt this process, you may
> have LAM daemons still running on remote nodes.
> -----------------------------------------------------------------------------
> lamboot: wipe -- nothing to do
> lamboot did NOT complete successfully
>
> Waiting for your reply.
> Thanks,
> Imran
>
>
>> From: Neil Storer <Neil.Storer_at_[hidden]>
>> Reply-To: Neil.Storer_at_[hidden]
>> To: imranahmedkhan82_at_[hidden]
>> Subject: Re: LAM: LAMBoot error, plz Help
>> Date: Mon, 19 Jul 2004 16:00:38 +0100
>>
>> Imran,
>>
>> Try using the "-d" option on "lamboot" to debug why it is going wrong.
>>
>> Have a look at the FAQ (Frequently Asked Questions) at:
>>
>> http://www.lam-mpi.org/faq/
>>
>> for tips as to what might be the cause. In particular, look at section
>> 4 (Booting LAM):
>>
>> http://www.lam-mpi.org/faq/category4.php3
>>
>> Regards
>> Neil
>>
>> Imran Ahmed khan wrote:
>>> Hi,
>>>
>>> I have installed LAM 7.0.6 successfully, but when I try to run
>>> lamboot it gives me an error.
>>>
>>> I ran: lamboot -v hostfile
>>>
>>> It gives this error:
>>>
>>> LAM 7.0.6/MPI 2 C++/ROMIO - Indiana University
>>>
>>> n-1<2745> ssi:boot:base:linear: booting n0 (ntws527.ssuet.edu.pk)
>>> n-1<2745> ssi:boot:base:linear: Failed to boot n0 (ntws527.ssuet.edu.pk)
>>> n-1<2745> ssi:boot:base:linear: aborted!
>>> -----------------------------------------------------------------------------
>>>
>>> lamboot encountered some error (see above) during the boot process,
>>> and will now attempt to kill all nodes that it was previously able to
>>> boot (if any).
>>>
>>> Please wait for LAM to finish; if you interrupt this process, you may
>>> have LAM daemons still running on remote nodes.
>>> -----------------------------------------------------------------------------
>>>
>>> lamboot: wipe -- nothing to do
>>> lamboot did NOT complete successfully
>>>
>>>
>>> Our host file contains:
>>> ntws527.ssuet.edu.pk
>>>
>>> (I also tried the IP address, but got the same error.)
>>>
>>> We are waiting for your reply.
>>> Thanks,
>>> Imran
>>>
>>> _______________________________________________
>>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>
>> --
>> +-----------------+---------------------------------+------------------+
>> | Neil Storer     | Head: Systems S/W Section       | Operations Dept. |
>> +-----------------+---------------------------------+------------------+
>> | ECMWF,          | email: neil.storer_at_[hidden]  |  //=\\  //=\\    |
>> | Shinfield Park, | Tel: (+44 118) 9499353          | //   \\//   \\   |
>> | Reading,        |      (+44 118) 9499000 x 2353   |      ECMWF       |
>> | Berkshire,      | Fax: (+44 118) 9869450          |      ECMWF       |
>> | RG2 9AX,        |                                 | \\   //\\   //   |
>> | UK              | URL: http://www.ecmwf.int/      |  \\=//  \\=//    |
>> +--+--------------+---------------------------------+----------------+-+
>> | ECMWF is the European Centre for Medium-Range Weather Forecasts |
>> +-----------------------------------------------------------------+
>>
>

-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have a LAM/MPI day: http://www.lam-mpi.org/