LAM/MPI General User's Mailing List Archives

From: Mahmoud Payami (mpayami_at_[hidden])
Date: 2006-04-26 00:49:06


Dear Josh,
Thank you for your help. I used the tip from your previous message and it
worked.
Now I have a problem with lamtests. The "ccl" and "dynamic" suites fail, but
the remaining suites pass. The following messages appear in ccl:

===================================
[mahmoud_at_condmat1 ~]$ lamboot -v -ssi boot rsh hostfile

LAM 7.1.2/MPI 2 C++/ROMIO - Indiana University

n-1<14181> ssi:boot:base:linear: booting n0 (condmat1.ctpm.aeoi.org)
n-1<14181> ssi:boot:base:linear: booting n1 (condmat10.ctpm.aeoi.org)
n-1<14181> ssi:boot:base:linear: finished
[mahmoud_at_condmat1 ccl]$ make check
Making check in intercomm
make[1]: Entering directory `/home/mahmoud/lamtests-7.1.2/ccl/intercomm'
make check-TESTS
make[2]: Entering directory `/home/mahmoud/lamtests-7.1.2/ccl/intercomm'
mpirun -x TEST -ssi cr none -s h C -ssi rpi crtcp
/home/mahmoud/lamtests-7.1.2/ccl/intercomm/./allgather_inter
MPI_Comm_accept: unclassified: Bad address (rank 0, comm 4)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (0, MPI_COMM_WORLD): - main()
MPI_Comm_accept: unclassified: Bad address (rank 1, comm 4)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD): - MPI_Comm_accept()

(At this point the system hangs, and I have to press CTRL-C to exit.)
===========================================================

...and the following messages appear when checking "dynamic":

====================================
[mahmoud_at_condmat1 ~]$ lamboot -v -ssi boot rsh hostfile

LAM 7.1.2/MPI 2 C++/ROMIO - Indiana University

n-1<13601> ssi:boot:base:linear: booting n0 (condmat1.ctpm.aeoi.org)
n-1<13601> ssi:boot:base:linear: booting n1 (condmat10.ctpm.aeoi.org)
n-1<13601> ssi:boot:base:linear: finished
[mahmoud_at_condmat1 ~]$ mc
[mahmoud_at_condmat1 dynamic]$ make check
make check-TESTS
make[1]: Entering directory `/home/mahmoud/lamtests-7.1.2/dynamic'
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi crtcp
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi lamd
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi sysv
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi tcp
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi usysv
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn
PASS: spawn
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi crtcp
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn_multiple
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi lamd
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn_multiple
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi sysv
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn_multiple
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi tcp
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn_multiple
mpirun -x TEST -ssi cr none -s h N -np 3 -ssi rpi usysv
/home/mahmoud/lamtests-7.1.2/dynamic/./spawn_multiple
PASS: spawn_multiple
mpirun -x TEST -ssi cr none -s h C -ssi rpi crtcp
/home/mahmoud/lamtests-7.1.2/dynamic/./client_server
MPI_Comm_accept: unclassified: Bad address (rank 0, MPI_COMM_SELF)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (0, MPI_COMM_WORLD): - main()
bufferd (getroute): invalid node
ERROR: mpirun/client_server/-ssi rpi crtcp returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi lamd
/home/mahmoud/lamtests-7.1.2/dynamic/./client_server
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/client_server/-ssi rpi lamd returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi sysv
/home/mahmoud/lamtests-7.1.2/dynamic/./client_server
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/client_server/-ssi rpi sysv returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi tcp
/home/mahmoud/lamtests-7.1.2/dynamic/./client_server
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/client_server/-ssi rpi tcp returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi usysv
/home/mahmoud/lamtests-7.1.2/dynamic/./client_server
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/client_server/-ssi rpi usysv returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
FAIL: client_server
mpirun -x TEST -ssi cr none -s h C -ssi rpi crtcp
/home/mahmoud/lamtests-7.1.2/dynamic/./comm_join
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/comm_join/-ssi rpi crtcp returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi lamd
/home/mahmoud/lamtests-7.1.2/dynamic/./comm_join
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/comm_join/-ssi rpi lamd returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi sysv
/home/mahmoud/lamtests-7.1.2/dynamic/./comm_join
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/comm_join/-ssi rpi sysv returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi tcp
/home/mahmoud/lamtests-7.1.2/dynamic/./comm_join
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/comm_join/-ssi rpi tcp returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
mpirun -x TEST -ssi cr none -s h C -ssi rpi usysv
/home/mahmoud/lamtests-7.1.2/dynamic/./comm_join
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
ERROR: mpirun/comm_join/-ssi rpi usysv returned nonzero status
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host condmat1.ctpm.aeoi.org.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamclean" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------
Warning: "post mpirun" command returned nonzero status
FAIL: comm_join
===================
2 of 4 tests failed
===================
make[1]: *** [check-TESTS] Error 1
make[1]: Leaving directory `/home/mahmoud/lamtests-7.1.2/dynamic'
make: *** [check-am] Error 2
[mahmoud_at_condmat1 dynamic]$
========================================
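
Since the output above is long and repetitive, the per-test verdicts can be
pulled out with grep. This is only a rough sketch: it assumes the "make check"
output has been saved to a file, and the sample lines below are copied from
the "dynamic" run above so that the snippet runs on its own. The temporary
file stands in for a real saved log.

```shell
# Stand-in for a saved log, e.g. one produced by: make check 2>&1 | tee check.log
# (the file name check.log is only an illustration).
log=$(mktemp)

# Sample lines copied from the "dynamic" output above.
cat > "$log" <<'EOF'
PASS: spawn
PASS: spawn_multiple
ERROR: mpirun/client_server/-ssi rpi lamd returned nonzero status
FAIL: client_server
FAIL: comm_join
EOF

# Keep only the PASS/FAIL verdict lines printed by "make check".
summary=$(grep -E '^(PASS|FAIL): ' "$log")
echo "$summary"

# Count the failed tests; grep -c prints 0 when nothing matches.
failed=$(grep -c '^FAIL: ' "$log" || true)
echo "failed: $failed"

rm -f "$log"
```

The ERROR and "no lamd running" lines are dropped on purpose, so only the
verdict for each test remains.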

> Sorry for the delay in my reply.
>
> You should be able to set the default ssh command by setting the "LAMRSH"
> environment variable appropriately. So something like:
> export LAMRSH=/usr/bin/ssh
>
> Hope that helps.
>
> -- Josh
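
For reference, the LAMRSH tip quoted above could be made a little more
defensive in a shell startup file. This is a sketch only: the fallback to
whatever ssh is found on the PATH is my own assumption, not something from
Josh's message, and ssh_path is just an illustrative variable name.

```shell
# Path suggested in the quoted message.
ssh_path=/usr/bin/ssh

# Assumption: if ssh is not installed there, fall back to the one on PATH.
if [ ! -x "$ssh_path" ]; then
    ssh_path=$(command -v ssh || true)
fi

# Export LAMRSH only when an ssh binary was actually found.
if [ -n "$ssh_path" ]; then
    LAMRSH="$ssh_path"
    export LAMRSH
fi

echo "LAMRSH=${LAMRSH:-unset}"
```

LAMRSH has to be set in the shell that runs lamboot, so lamboot must be
started again after changing it.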