-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi there,
I am integrating BLCR with LAM/MPI and am running into a problem.
When I try to restart from a context file with lamrestart, I get:
"cri_syscall(CR_OP_RSTRT_REQ, &req): Device or resource busy"
This is what I have done:
- Output of laminfo:
LAM/MPI: 7.1.2
Prefix: /opt/lam
Architecture: x86_64-unknown-linux-gnu
Configured by: root
Configured on: Thu Sep 28 15:23:39 WEST 2006
Configure host: XXXXXXXXX
Memory manager: ptmalloc2
C bindings: yes
C++ bindings: yes
Fortran bindings: yes
C compiler: gcc
C++ compiler: g++
Fortran compiler: g77
Fortran symbols: double_underscore
C profiling: yes
C++ profiling: yes
Fortran profiling: yes
C++ exceptions: no
Thread support: yes
ROMIO support: yes
IMPI support: no
Debug support: no
Purify clean: no
SSI boot: globus (API v1.1, Module v0.6)
SSI boot: rsh (API v1.1, Module v1.1)
SSI boot: slurm (API v1.1, Module v1.0)
SSI coll: lam_basic (API v1.1, Module v7.1)
SSI coll: shmem (API v1.1, Module v1.0)
SSI coll: smp (API v1.1, Module v1.2)
SSI rpi: crtcp (API v1.1, Module v1.1)
SSI rpi: lamd (API v1.0, Module v7.1)
SSI rpi: sysv (API v1.0, Module v7.1)
SSI rpi: tcp (API v1.0, Module v7.1)
SSI rpi: usysv (API v1.0, Module v7.1)
SSI cr: blcr (API v1.0, Module v1.1)
SSI cr: self (API v1.0, Module v1.0)
- I've tested BLCR with single processes, without LAM, and it works.
- Then I tried a simple test program:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int node;   /* MPI rank of this process */
    int i, j;
    float f;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);
    printf("Hello World from Node %d.\n", node);

    /* Long-running loop that keeps the job alive and printing. */
    for (j = 0; j <= 100000; j++)
        for (i = 0; i <= 100000; i++) {
            f = i*2.718281828*i + i + i*3.141592654;
            printf("Interaction [i=%d,j=%d]=%f\n", i, j, f);
        }

    MPI_Finalize();
    return 0;
}
I compiled it with:
mpicc -o hello-lam mpihello.c -L/opt/lam/lib/ -I/opt/lam/include -lmpi
ran it with:
mpirun -ssi cr blcr C hello-lam
created the context file with:
lamcheckpoint -ssi cr blcr -pid 22286 -ssi cr_blcr_base_dir /tmp
killed the process:
kill -9 22286
and then tried to restart from the context file:
lamrestart -ssi cr blcr -ssi cr_blcr_context_file /tmp/context.mpirun.22286
which returns:
cri_syscall(CR_OP_RSTRT_REQ, &req): Device or resource busy
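For anyone who wants to reproduce this, the checkpoint, kill, and restart steps above can be collected into one small script. This is only a sketch: the function name repro_cmds and its two arguments are illustrative names of my own, the PID and directory are the ones from the run above, and the script only prints the commands (a dry run) instead of executing them.

```shell
#!/bin/sh
# Dry-run sketch of the steps above: prints the commands, runs nothing.
# repro_cmds and its arguments are illustrative names, not part of LAM or BLCR.

repro_cmds() {
    pid="$1"        # PID passed to lamcheckpoint (22286 in the run above)
    base_dir="$2"   # directory given as cr_blcr_base_dir (/tmp above)

    # 1. Checkpoint the running job.
    echo "lamcheckpoint -ssi cr blcr -pid $pid -ssi cr_blcr_base_dir $base_dir"
    # 2. Kill the process.
    echo "kill -9 $pid"
    # 3. Restart from the context file written in step 1.
    echo "lamrestart -ssi cr blcr -ssi cr_blcr_context_file $base_dir/context.mpirun.$pid"
}

repro_cmds 22286 /tmp
```

With the values above, the three printed lines are the same commands I typed by hand.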
Any ideas on how I could solve this?
I'd appreciate any help :)
PS: The BLCR version is blcr-0.4.2.
- --
Rui Ramos
==============================================
Universidade do Porto - IRICUP
Praça Gomes Teixeira, 4099-002 Porto, Portugal
email: rramos[at]iric.up.pt
==============================================
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
iQEVAwUBRRvz271uR0bdnTWSAQKZGQgAsdM+cOMdvT38ysVJdwO/+JN707Fwli07
mNcbpb0UsKIaQbkycbL7B/o/zW5VC45j5Nmkbn0s/v48A+9MPIzQZQ7qE6azU2vG
wL5mMo1bqcHEPusDgNXbLoK7HHVKcGTBBOR7UqXcRyfpUFx/ohJiNp/YutZlIkN/
DXWHd4PW3EXVBgoKn0SUkgIJ8Rk7tE1D2TUTc+7TzqV6lXoxtA6sOoqyFEOZYzjc
UZLDKFG9KF/r8EMIU7/Px0YxJn2kODHFzNq6VmoWmdT27QLfLjqUdvFrciSTNSyt
i5xtfEbHWVaTOFr0QSFVz0mf3wG31Wgg2Wq43kRbbvjAfR4VsqT/uQ==
=BIeF
-----END PGP SIGNATURE-----