We are using LAM/MPI 7.0.4 together with the Intel C/C++ compiler 8.0
to implement a hybrid model.
When we installed LAM/MPI, we used two options: --with-cc=icc and
--with-cxx=icc.
After installing, we found an old version of lamboot (possibly version
6.0.5), so we deleted it and reinstalled the new version of lamboot (7.0.4,
the free version).
However, we still have a problem.
Everything worked well with the old version, but since we installed the new
version of lamboot it no longer works.
We would like to know why this happens and how we can solve the
problem.
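Because an older lamboot was already present on this machine, one check we can run is whether more than one copy of the LAM programs is still reachable through PATH. The following is only a sketch under that assumption: the helper name find_all_on_path is our own and is not part of LAM/MPI, and the program names are simply the ones that appear in the boot output below.

```python
import os


def find_all_on_path(name, path=None):
    """Return every executable file called `name` found in the PATH
    directories, in the order the shell would search them."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        # Keep only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in matches:
                matches.append(candidate)
    return matches


if __name__ == "__main__":
    # Program names taken from the lamboot output; adjust as needed.
    for tool in ("lamboot", "hboot", "lamd", "tkill"):
        copies = find_all_on_path(tool)
        print(tool, "->", copies if copies else "not found on PATH")
```

If a program is listed in more than one directory, the first entry is the one the shell would normally run, so comparing that list with the directory of the new installation shows whether old and new files might be mixed.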
Here is our machine configuration:
CPU: Intel P4 processor
RAM: 512MB RDRAM
The error messages are:
[mageshin_at_TEAM11-4 mageshin]$ lamboot -v -d lamhost
n0<5682> ssi:boot: Opening
n0<5682> ssi:boot: opening module globus
n0<5682> ssi:boot: initializing module globus
n0<5682> ssi:boot:globus: globus-job-run not found, globus boot will not run
n0<5682> ssi:boot: module not available: globus
n0<5682> ssi:boot: opening module rsh
n0<5682> ssi:boot: initializing module rsh
n0<5682> ssi:boot:rsh: module initializing
n0<5682> ssi:boot:rsh:agent: rsh
n0<5682> ssi:boot:rsh:username: <same>
n0<5682> ssi:boot:rsh:verbose: 1000
n0<5682> ssi:boot:rsh:algorithm: linear
n0<5682> ssi:boot:rsh:priority: 10
n0<5682> ssi:boot: module available: rsh, priority: 10
n0<5682> ssi:boot: finalizing module globus
n0<5682> ssi:boot:globus: finalizing
n0<5682> ssi:boot: closing module globus
n0<5682> ssi:boot: Selected boot module rsh
LAM 7.0.4/MPI 2 C++/ROMIO - Indiana University
n0<5682> ssi:boot:base: looking for boot schema in following directories:
n0<5682> ssi:boot:base: <current directory>
n0<5682> ssi:boot:base: $TROLLIUSHOME/etc
n0<5682> ssi:boot:base: $LAMHOME/etc
n0<5682> ssi:boot:base: /usr/local/share/lam/etc
n0<5682> ssi:boot:base: looking for boot schema file:
n0<5682> ssi:boot:base: lamhost
n0<5682> ssi:boot:base: found boot schema: lamhost
n0<5682> ssi:boot:rsh: found the following hosts:
n0<5682> ssi:boot:rsh: n0 TEAM11-4 (cpu=1)
n0<5682> ssi:boot:rsh: n1 TEAM11-2 (cpu=1)
n0<5682> ssi:boot:rsh: resolved hosts:
n0<5682> ssi:boot:rsh: n0 TEAM11-4 --> 210.123.39.243 (origin)
n0<5682> ssi:boot:rsh: n1 TEAM11-2 --> 210.123.39.167
n0<5682> ssi:boot:rsh: starting RTE procs
n0<5682> ssi:boot:base:linear: starting
n0<5682> ssi:boot:base:server: opening server TCP socket
n0<5682> ssi:boot:base:server: opened port 33171
n0<5682> ssi:boot:base:linear: booting n0 (TEAM11-4)
n0<5682> ssi:boot:rsh: starting lamd on (TEAM11-4)
n0<5682> ssi:boot:rsh: starting on n0 (TEAM11-4): hboot -t -c lam-conf.lamd -d -v -I -H 210.123.39.243 -P 33171 -n 0 -o 0
n0<5682> ssi:boot:rsh: launching locally
hboot: process schema = "lam-conf.lamd"
hboot: found /usr/bin/lamd
hboot: performing tkill
hboot: tkill
hboot: booting...
hboot: fork /usr/bin/lamd
hboot: attempting to execute
[1] 5685 lamd -H 210.123.39.243 -P 33171 -n 0 -o 0 -d
n0<5682> ssi:boot:rsh: successfully launched on n0 (TEAM11-4)
n0<5682> ssi:boot:base:server: expecting connection from finite list
n0<5682> ssi:boot:base:server: got connection from 210.123.39.243
n0<5682> ssi:boot:base:server: this connection is expected (n0)
-----------------------------------------------------------------------------
The lamboot agent failed to read a message over a socket from the
newly-booted process. This should not happen (especially since TCP is
a guaranteed protocol).
Please check your network connectivity and ensure that messages can be
passed reliably over TCP. Additionally, ensure that the host where
the newly-booted process was launched is healthy and still available
on the network.
-----------------------------------------------------------------------------
n0<5682> ssi:boot:base:server: failed to connect to remote lamd!
n0<5682> ssi:boot:base:server: closing server socket
n0<5682> ssi:boot:base:linear: aborted!
-----------------------------------------------------------------------------
lamboot encountered some error (see above) during the boot process,
and will now attempt to kill all nodes that it was previously able to
boot (if any).
Please wait for LAM to finish; if you interrupt this process, you may
have LAM daemons still running on remote nodes.
-----------------------------------------------------------------------------
lamboot: wipe -- nothing to do
lamboot did NOT complete successfully