Hi,
I'm trying to run some tests using BLCR (the checkpoint library),
LAM/MPI and a 64-bit architecture, but I'm having some
problems.
First of all, the lamtests suite doesn't pass! I get the
following error message after running configure, make and
make -k check:
mpirun -x TEST -s h C -ssi rpi crtcp /home/asc/lamtests-7.1.1/ccl/intercomm/./allgather_inter
MPI_Comm_accept: unclassified: Bad address (rank 0, comm 2)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (0, MPI_COMM_WORLD): - main()
MPI_Comm_accept: unclassified: Bad address (rank 2, comm 2)
Rank (4, MPI_COMM_WORLD): Call stack within LAM:
Rank (4, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (4, MPI_COMM_WORLD): - main()
MPI_Comm_accept: unclassified: Bad address (rank 4, comm 2)
Rank (8, MPI_COMM_WORLD): Call stack within LAM:
Rank (8, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (8, MPI_COMM_WORLD): - main()
MPI_Comm_accept: unclassified: Bad address (rank 5, comm 2)
Rank (10, MPI_COMM_WORLD): Call stack within LAM:
Rank (10, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (10, MPI_COMM_WORLD): - main()
MPI_Comm_accept: unclassified: Bad address (rank 3, comm 2)
Rank (6, MPI_COMM_WORLD): Call stack within LAM:
Rank (6, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (6, MPI_COMM_WORLD): - main()
MPI_Comm_accept: unclassified: Bad address (rank 1, comm 2)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD): - MPI_Comm_accept()
Rank (2, MPI_COMM_WORLD): - main()
make[3]: *** [check-TESTS] Interrupt
make[2]: *** [check-am] Interrupt
make[1]: *** [check-recursive] Interrupt
make: *** [check-recursive] Interrupt
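To make the output above easier to read, here is a small Python sketch that pairs each error line with the MPI_COMM_WORLD rank reported right after it. It is not part of lamtests: the helper name pair_ranks is made up for illustration, the log is abridged to the error line and the first call-stack line of each failing process, and it assumes that each error line is immediately followed by the call stack of the same process (the output of different processes could in principle be interleaved).

```python
import re

# Abridged copy of the output above: for each failing process, the error
# line and the first line of its call stack.
LOG = """\
MPI_Comm_accept: unclassified: Bad address (rank 0, comm 2)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
MPI_Comm_accept: unclassified: Bad address (rank 2, comm 2)
Rank (4, MPI_COMM_WORLD): Call stack within LAM:
MPI_Comm_accept: unclassified: Bad address (rank 4, comm 2)
Rank (8, MPI_COMM_WORLD): Call stack within LAM:
MPI_Comm_accept: unclassified: Bad address (rank 5, comm 2)
Rank (10, MPI_COMM_WORLD): Call stack within LAM:
MPI_Comm_accept: unclassified: Bad address (rank 3, comm 2)
Rank (6, MPI_COMM_WORLD): Call stack within LAM:
MPI_Comm_accept: unclassified: Bad address (rank 1, comm 2)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
"""

# "<function>: <message> (rank R, comm C)"
ERROR_RE = re.compile(r"^(\w+): .*\(rank (\d+), comm (\d+)\)$")
# "Rank (W, MPI_COMM_WORLD): Call stack within LAM:"
WORLD_RE = re.compile(r"^Rank \((\d+), MPI_COMM_WORLD\): Call stack within LAM:$")


def pair_ranks(log):
    """Return sorted (world_rank, comm_rank, comm_id) tuples.

    Assumption: an error line is followed by the call stack of the
    same process before the next error line appears.
    """
    pairs = []
    pending = None  # (comm_rank, comm_id) waiting for its world rank
    for line in log.splitlines():
        m = ERROR_RE.match(line)
        if m:
            pending = (int(m.group(2)), int(m.group(3)))
            continue
        m = WORLD_RE.match(line)
        if m and pending is not None:
            pairs.append((int(m.group(1)), pending[0], pending[1]))
            pending = None
    return sorted(pairs)


pairs = pair_ranks(LOG)
for world, local, comm in pairs:
    print(f"world rank {world:2d} -> rank {local} in comm {comm}")
```

Under that pairing assumption, every process that reports the error has an even rank in MPI_COMM_WORLD, and its rank in comm 2 is half of that.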
My system is Debian, running:
kernel 2.6.12.1
BLCR 0.4.2
LAM/MPI 7.1.2b31
compiled with gcc-4.0.3 (and all packages updated with apt-get)
My hostfile has just one line:
localhost cpu=12
The information returned by laminfo is:
asc@gdx0148:~/lamtests-7.1.1$ laminfo
LAM/MPI: 7.1.2b31
Prefix: /usr/local
Architecture: x86_64-unknown-linux-gnu
Configured by: root
Configured on: Wed Feb 15 11:24:58 CET 2006
Configure host: gdx0148.orsay.grid5000.fr
Memory manager: ptmalloc2
C bindings: yes
C++ bindings: yes
Fortran bindings: no
C compiler: gcc
C++ compiler: g++
Fortran compiler: false
Fortran symbols: none
C profiling: yes
C++ profiling: yes
Fortran profiling: no
C++ exceptions: no
Thread support: yes
ROMIO support: yes
IMPI support: no
Debug support: no
Purify clean: no
SSI boot: globus (API v1.1, Module v0.6)
SSI boot: rsh (API v1.1, Module v1.1)
SSI boot: slurm (API v1.1, Module v1.0)
SSI coll: lam_basic (API v1.1, Module v7.1)
SSI coll: shmem (API v1.1, Module v1.0)
SSI coll: smp (API v1.1, Module v1.2)
SSI rpi: crtcp (API v1.1, Module v1.1)
SSI rpi: lamd (API v1.0, Module v7.1)
SSI rpi: sysv (API v1.0, Module v7.1)
SSI rpi: tcp (API v1.0, Module v7.1)
SSI rpi: usysv (API v1.0, Module v7.1)
SSI cr: blcr (API v1.0, Module v1.1)
SSI cr: self (API v1.0, Module v1.0)
LAM was configured using:
./configure --without-fc --with-rpi=crtcp --with-cr-blcr=/usr/local --with-threads=posix
BLCR itself seems to be OK, since I tested it on some sequential
programs and could stop and restart them without problems.
Any ideas? I checked many system options and couldn't
find anything wrong.
Thanks a lot for any help.
ASC
--
___________________________________________________________________
CARISSIMI, Alexandre Universidade Federal do Rio Grande do Sul
asc_at_[hidden] Instituto de Informática
Tel: +55.51.33.16.61.69 Caixa Postal 15064
Fax: +55.51.33.16.73.08 CEP:91501-970 Porto Alegre - RS - Brasil
___________________________________________________________________