LAM/MPI General User's Mailing List Archives

From: Jerry Mersel (jerry.mersel_at_[hidden])
Date: 2008-12-08 04:05:30


Hi:

  You should either downgrade BLCR to 0.6.4 or upgrade to the
  beta version of 0.8.0. A bug prevents BLCR 0.7.0 and 0.7.3
  from checkpointing.

                Regards,
                  Jerry
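
As a rough sketch of the advice above, assuming you already know the
installed BLCR version string, a small shell check could flag the two
releases named in this message. The helper name blcr_version_reported_ok
is hypothetical and is not part of BLCR or LAM/MPI; it only encodes the
versions mentioned here and says nothing about other releases.

```shell
#!/bin/sh
# Hypothetical helper (not part of BLCR or LAM/MPI).
# Returns 1 if the given BLCR version string is one of the releases
# named in this thread as unable to checkpoint, 0 otherwise.
# The caller supplies the version string.
blcr_version_reported_ok() {
    case "$1" in
        0.7.0|0.7.3) return 1 ;;  # reported as broken in this thread
        *)           return 0 ;;  # not covered by this report
    esac
}

# Example: check the version from the original report.
if blcr_version_reported_ok "0.7.3"; then
    echo "0.7.3: not one of the versions reported as broken"
else
    echo "0.7.3: reported as broken; use 0.6.4 or the 0.8.0 beta"
fi
```

A return value of 0 only means the version is not one of the two named
here, not that checkpointing is known to work with it.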

> I run my program using
> mpirun -ssi rpi crtcp -ssi cr blcr -np 2 ./hello
> then I use
> lamcheckpoint -ssi cr blcr -pid mpirun_pid
> and get the following error message:
> -----------------------------------------------------------------------
> Encountered a failure in the SSI types while continuing from
> checkpoint. Aborting in despair :-(
> -----------------------------------------------------------------------
> rpwait failed: Success
>
> Checkpoint failed: no process checkpointed.
>
> No checkpoint file is created. The process is terminated on the node
> where lamcheckpoint was invoked, while the process and two instances
> of cr_checkpoint remain in the process list of the second node.
>
> LAM-MPI version is 7.1.4
> BLCR version is 0.7.3
> BLCR is configured with --enable-static and --enable-all-static
> OS - CentOS 5
> --
> With best regards
> Gleb "Crazy Sage" Igumnov
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>