
LAM/MPI General User's Mailing List Archives


From: Heiko Bauke (heiko.bauke_at_[hidden])
Date: 2005-03-17 15:28:21


Dear all,

I'm trying to use LAM/MPI 7.1.1 with Berkeley Lab Checkpoint/Restart
(BLCR) 0.4.0 and kernel 2.4.26, but I cannot get it to work correctly.
Is anybody using BLCR to checkpoint MPI applications?

I can checkpoint and restart sequential programs and programs that use
POSIX threads without problems. So BLCR itself seems to work.

As described in the User's Guide, I start my MPI programs with

$ mpirun -np 3 -ssi rpi crtcp -ssi cr blcr checkpoint_mpi

When I call cr_checkpoint with the PID of mpirun, only a single
checkpoint file with the context of mpirun is saved, and I cannot find
any context files for my application processes. I also tried linking
my application directly against libcr.so, but this did not help.
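
For reference, the steps described above can be sketched as a small shell
script. This is only a sketch under assumptions: the helper
list_context_files is hypothetical, the 5 second wait is arbitrary, and the
file name pattern context.<pid> is assumed to be the BLCR default. Where
LAM/MPI writes the context files of the application processes is exactly
the open question, so the script merely looks in the current directory and
in $HOME.

```shell
#!/bin/sh
# Sketch of the procedure described above (assumptions are marked).

# Hypothetical helper: print BLCR-style context files found in a directory.
# Assumes files are named context.<pid>; prints nothing if none exist.
list_context_files() {
    dir=${1:-.}
    ls "$dir"/context.* 2>/dev/null || true
}

if command -v mpirun >/dev/null 2>&1 &&
   command -v cr_checkpoint >/dev/null 2>&1; then
    # Start the MPI job with the checkpointable RPI and the BLCR module.
    mpirun -np 3 -ssi rpi crtcp -ssi cr blcr checkpoint_mpi &
    MPIRUN_PID=$!

    # Arbitrary delay so the job is running before the checkpoint request.
    sleep 5

    # Request a checkpoint by passing the PID of mpirun.
    cr_checkpoint "$MPIRUN_PID"

    # Look for context files in two assumed locations.
    list_context_files .
    list_context_files "$HOME"
else
    echo "mpirun or cr_checkpoint not found; skipping" >&2
fi
```

With this I would expect one context file for mpirun plus one per MPI
process, but only the mpirun file shows up.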

Does anybody have an idea what I could have done wrong?

        Heiko

P.S.: output of laminfo is:

bauke@hal:~ $ laminfo
             LAM/MPI: 7.1.1
              Prefix: /usr
        Architecture: i686-pc-linux-gnu
       Configured by: root
       Configured on: Thu Mar 17 18:09:34 CET 2005
      Configure host: hal
      Memory manager: ptmalloc2
          C bindings: yes
        C++ bindings: yes
    Fortran bindings: yes
          C compiler: gcc
        C++ compiler: g++
    Fortran compiler: g77
     Fortran symbols: double_underscore
         C profiling: yes
       C++ profiling: yes
   Fortran profiling: yes
      C++ exceptions: no
      Thread support: yes
       ROMIO support: yes
        IMPI support: no
       Debug support: no
        Purify clean: no
            SSI boot: globus (API v1.1, Module v0.6)
            SSI boot: rsh (API v1.1, Module v1.1)
            SSI boot: slurm (API v1.1, Module v1.0)
            SSI coll: lam_basic (API v1.1, Module v7.1)
            SSI coll: shmem (API v1.1, Module v1.0)
            SSI coll: smp (API v1.1, Module v1.2)
             SSI rpi: crtcp (API v1.1, Module v1.1)
             SSI rpi: lamd (API v1.0, Module v7.1)
             SSI rpi: sysv (API v1.0, Module v7.1)
             SSI rpi: tcp (API v1.0, Module v7.1)
             SSI rpi: usysv (API v1.0, Module v7.1)
              SSI cr: blcr (API v1.0, Module v1.1)
              SSI cr: self (API v1.0, Module v1.0)

-- 
-- Women are astonished at what men forget. Men are
-- astonished at what women remember.
-- (Peter Bamm, German writer, 1897-1975)
-- Heiko Bauke @ http://www.uni-magdeburg.de/bauke