LAM/MPI General User's Mailing List Archives

From: Andrew Friedley (afriedle_at_[hidden])
Date: 2006-03-17 13:22:58


Nobuhiro KUSUNO wrote:
> I want to learn how to use checkpoint/restart.
> But I can't find any web sites that mention it.
> Regretfully, I don't understand the libssi_cr system well enough from the PDF
> technical document.

That's probably not the kind of document you are after.

> So...
> Can I access example source code for "-ssi cr self" and tutorials on
> the usage of libssi_cr, the lamcheckpoint command, etc.?

I've found that section 9.5 of the LAM User's Manual discusses use of
checkpoint/restart:

http://www.lam-mpi.org/download/files/7.1.2-user.pdf

Also, some LAM man pages might be of use to you:

lamssi_cr(7)
lamcheckpoint(1)
lamrestart(1)

These are available with LAM itself and can be found via your favorite
search engine as well.

The blcr module is likely what you want. That said, the self module looks
fairly easy to use (I have no personal experience with it): the
application defines three functions, which are called on checkpoint,
restart, and continue. What these functions need to do is specific to
your application, so I can't help you there.
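
As a rough illustration of that three-function structure, here is a
minimal C sketch. The function names (app_checkpoint, app_continue,
app_restart), their int(void) signatures, the return-value convention,
and the state file name are all hypothetical placeholders, not LAM's
API; check lamssi_cr(7) for the names the self module actually looks up
and how they are registered. The "state" here is a single counter, which
stands in for whatever your application really needs to save.

```c
#include <stdio.h>

/* Hypothetical application state: a single loop counter. */
static int iteration = 0;

/* Hypothetical file used to hold the saved state. */
static const char *STATE_FILE = "app_state.ckpt";

/* Called when a checkpoint is requested: write the state to disk.
 * Returns 0 on success, -1 on failure (an assumed convention). */
int app_checkpoint(void)
{
    FILE *f = fopen(STATE_FILE, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "%d\n", iteration) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Called when the same process resumes after a checkpoint.
 * Nothing was lost in that case, so there is nothing to reload. */
int app_continue(void)
{
    return 0;
}

/* Called when a new process is started from a checkpoint:
 * read the saved state back in. */
int app_restart(void)
{
    FILE *f = fopen(STATE_FILE, "r");
    if (f == NULL)
        return -1;
    int ok = (fscanf(f, "%d", &iteration) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

In a real program these would sit alongside your MPI code, and the
restart function would rebuild everything the application needs before
it resumes computing.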

Andrew