LAM/MPI General User's Mailing List Archives

From: Sébastien Georget (Sebastien.Georget_at_[hidden])
Date: 2004-09-02 10:26:05


Hello,

   I am trying to use the checkpoint feature of LAM.

There is no problem when I checkpoint a LAM job. The job
seems to be correctly checkpointed:
   mpirun -v -ssi rpi crtcp -ssi cr blcr -np 1 myjob

   cr_checkpoint `pgrep mpirun`
   (no output)
   (context.PID created)

But I get the following error when I try to restart it:
   cr_restart context.PID
   mpirun: Bad file descriptor
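
For reference, the checkpoint step above can be wrapped in a small script
that at least rules out two easy mistakes: `pgrep mpirun` matching more than
one process, and the context file not being where it is expected. This is
only a sketch. It assumes that cr_checkpoint, pgrep and mpirun are on the
PATH and that the context file is written as context.PID in the current
directory, as observed above. The function names (context_file, single_pid,
checkpoint_mpirun) are made up for illustration, and nothing here addresses
the "Bad file descriptor" error itself.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the cr_checkpoint invocation
# and the context.PID file name are taken from the commands shown above.

# Print the context file name expected for a given PID.
context_file() {
    printf 'context.%s\n' "$1"
}

# Succeed only if the argument is a single all-digit PID.
# Empty input, or several PIDs separated by spaces/newlines, is rejected.
single_pid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Checkpoint the one running mpirun and print the context file name.
checkpoint_mpirun() {
    pid=$(pgrep -x mpirun)
    if ! single_pid "$pid"; then
        echo "expected exactly one mpirun process, got: '$pid'" >&2
        return 1
    fi
    cr_checkpoint "$pid" || return 1
    ctx=$(context_file "$pid")
    if [ ! -f "$ctx" ]; then
        echo "checkpoint reported success but $ctx is missing" >&2
        return 1
    fi
    echo "$ctx"
}
```

After sourcing the file, the sequence would then be
`ctx=$(checkpoint_mpirun) && cr_restart "$ctx"`.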

Has anybody encountered this problem?
Is there any documentation on checkpointing beyond the "User's Guide"?

Sébastien

-- 
Sébastien Georget
INRIA Sophia-Antipolis, Service DREAM, B.P. 93
06902 Sophia-Antipolis Cedex, FRANCE
E-mail : sebastien.georget_at_[hidden]