LAM/MPI General User's Mailing List Archives

From: Yenke Blaise Omer (blaise-omer.yenke_at_[hidden])
Date: 2009-08-19 23:30:42


Thank you very much Mr Josh.

Best regards.

Josh Hursey wrote:
>
> On Aug 19, 2009, at 3:28 AM, Blaise-Omer.Yenke_at_[hidden] wrote:
>
>> Hi all
>>
>> I'm conducting some experiments to evaluate the checkpointing time of
>> a parallel application with LAM/MPI.
>
> You may want to consider using Open MPI, since LAM/MPI is in
> maintenance mode and no longer being actively developed.
>
>>
>> I'd like to know whether the processes of the application are saved
>> one after another or in parallel, after the synchronization phase.
>
> Once the checkpoint message coordination protocol has finished, all of
> the checkpoints are written in parallel, one from each process in the
> parallel job.
>
>>
>> I'd be grateful for any references.
>
> There is one paper on checkpoint/restart in LAM/MPI, and two on the
> implementation in Open MPI. All can be found at the link below:
> http://osl.iu.edu/publications/Keyword/CHECKPOINTRESTART.php
>
> Best,
> Josh
>
>>
>> Regards.
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>
>
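
The behaviour described in the thread (a coordination phase first, then every process writing its own checkpoint at the same time) can be illustrated with a small simulation. This is only a sketch and not LAM/MPI or Open MPI code: threads stand in for MPI processes, a `threading.Barrier` stands in for the coordination protocol, and the names `run_checkpoint`, `state_bytes`, and the `rank<N>.ckpt` file names are hypothetical.

```python
import os
import tempfile
import threading
import time


def run_checkpoint(num_procs, state_bytes=1024, directory=None):
    """Simulate a coordinated checkpoint with num_procs workers.

    Returns a list of (rank, path, start, end) tuples, one per worker.
    """
    directory = directory or tempfile.mkdtemp(prefix="ckpt_")
    # Stand-in for the coordination protocol: nobody writes until all arrive.
    barrier = threading.Barrier(num_procs)
    results = [None] * num_procs

    def worker(rank):
        state = bytes([rank % 256]) * state_bytes  # dummy process state
        barrier.wait()                             # coordination phase ends here
        start = time.perf_counter()
        path = os.path.join(directory, f"rank{rank}.ckpt")
        with open(path, "wb") as f:                # each worker writes its own file
            f.write(state)
        end = time.perf_counter()
        results[rank] = (rank, path, start, end)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    results = run_checkpoint(4)
    first_start = min(r[2] for r in results)
    last_end = max(r[3] for r in results)
    for rank, path, start, end in results:
        print(f"rank {rank}: wrote {path} in {end - start:.6f} s")
    # Because the writes overlap, the overall time is roughly that of the
    # slowest writer rather than the sum over all workers.
    print(f"overall checkpoint time: {last_end - first_start:.6f} s")
```

When timing a real job, the same idea applies: the checkpointing time to measure is the span from the end of coordination to the last process finishing its write, not the sum of the per-process times.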