LAM/MPI General User's Mailing List Archives

From: Prabhanjan Kambadur (pkambadu_at_[hidden])
Date: 2004-04-21 18:05:41


Hi,

Sorry for the delayed reply. In LAM, we implement a modified form of the
CL algorithm (in the crtcp module). Broadly speaking, the CL algorithm
specifies that we save the state of the processes and of the network so
that both can be restored at a later point in time. We do the same, but
instead of saving the state of the network, we drain it. We are not aware
of any implementation that follows the exact protocol specified in the CL
algorithm. Also note that most implementations modify the CL algorithm,
since CL does not scale when the number of processes (n) is very large and
the network is only "virtually" fully connected as opposed to being
"physically fully connected".
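
For readers unfamiliar with CL, here is a minimal single-threaded sketch of
the textbook marker protocol over simulated FIFO channels. It is not LAM's
crtcp code and does not use BLCR. The names Process, Network and MARKER, and
the "balance" used as process state, are hypothetical and chosen only for
illustration. It assumes a fully connected network, so one snapshot sends
n*(n-1) markers, which is one way to read the scalability remark above.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical control message, distinct from integer payloads


class Process:
    def __init__(self, pid, balance, peers):
        self.pid = pid
        self.balance = balance        # stand-in for the process state
        self.peers = peers            # ids of all other processes
        self.recorded_state = None    # local state captured by the snapshot
        self.recording = set()        # incoming channels still being recorded
        self.channel_log = {}         # src -> messages recorded as in flight


class Network:
    """Fully connected network with one FIFO channel per ordered pair."""

    def __init__(self, balances):
        n = len(balances)
        self.procs = [Process(i, b, [j for j in range(n) if j != i])
                      for i, b in enumerate(balances)]
        self.chan = {(i, j): deque()
                     for i in range(n) for j in range(n) if i != j}
        self.markers_sent = 0

    def send(self, src, dst, amount):
        # Application message: the amount leaves src and is in flight.
        self.procs[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def start_snapshot(self, pid):
        self._record(self.procs[pid])

    def _record(self, p):
        # Record local state, start recording every incoming channel,
        # then send a marker on every outgoing channel.
        p.recorded_state = p.balance
        p.recording = set(p.peers)
        p.channel_log = {src: [] for src in p.peers}
        for dst in p.peers:
            self.chan[(p.pid, dst)].append(MARKER)
            self.markers_sent += 1

    def deliver(self, src, dst):
        msg = self.chan[(src, dst)].popleft()
        p = self.procs[dst]
        if msg == MARKER:
            if p.recorded_state is None:
                self._record(p)       # first marker seen: take local snapshot
            p.recording.discard(src)  # channel src -> dst is now closed
        else:
            p.balance += msg
            if src in p.recording:
                p.channel_log[src].append(msg)  # part of the channel state

    def run_to_quiescence(self):
        progressed = True
        while progressed:
            progressed = False
            for key, queue in self.chan.items():
                if queue:
                    self.deliver(*key)
                    progressed = True

    def snapshot_total(self):
        # Recorded process states plus recorded in-flight messages.
        return sum(p.recorded_state
                   + sum(sum(msgs) for msgs in p.channel_log.values())
                   for p in self.procs)


net = Network([100, 50, 25])
net.send(0, 1, 10)       # in flight when the snapshot starts
net.send(2, 0, 5)        # in flight when the snapshot starts
net.start_snapshot(0)
net.send(1, 2, 20)       # sent before process 1 has seen any marker
net.run_to_quiescence()
print(net.snapshot_total(), net.markers_sent)
```

The recorded process states and channel states together account for the
initial total, which is the consistency property the markers are meant to
guarantee. A drain-based variant, as described above, would instead wait
until the channels are empty, so no channel state needs to be saved.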

Hope this helps,
Anju.

On Sat, 17 Apr 2004, Der Herr Hofrat wrote:

> > Hi all,
> > Is there an implementation of the Chandy & Lamport's distributed
> > snapshot algorithm using the BLCR package in LAM ?
> >
> > If yes, where can I get it?
> >
> don't know if this is of any help - but it was implemented
> for MPICH on Linux-2.4.18 - so that's fairly recent - given
> the layering of LAM with the CR-SSI available - it might not
> be that much of an effort to wrap it up into a CR-module.
>
> a paper on this is at http://www.lri.fr/~gk/MPICH-V/papers/Cluster2003.pdf
> and the V-CL is available in the download section of MPICH-V.
>
> hofrat
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>