LAM/MPI General User's Mailing List Archives

From: Josh Hursey (jjhursey_at_[hidden])
Date: 2007-03-22 13:20:21


This seems more like a question for the AIX checkpoint/restart team.
You should contact them directly, as this list is for LAM/MPI-specific
discussion.

Good luck,
Josh

On Mar 13, 2007, at 5:03 AM, rama krishna wrote:

> Hi everybody,
>
> I am using AIX LoadLeveler 3.1 to checkpoint my simple serial
> application. The problem occurs while generating the checkpoint file: it
> generates a ckpt file whose name has the extension .err (ckptname.err). When
> restart_from_ckpt is set to yes in the job command file and I run the
> job, the node simply removes the job from the queue and I do not
> get an output file.
>
> I am posting my job command file and
> application here. Please let me know if anybody knows why the
> ckpt file is not generated in the correct format and how to debug the
> problem. Thanks in advance.
>
>
> My job command file
>
> # For First.c
> # @ job_type = serial
> # @ executable = first
> # @ output = stp.out
> # @ error = stp.err
> # @ class = general
> # @ checkpoint = yes
> # @ restart_from_ckpt = yes
> # @ ckpt_dir = /home/rtsg/crypt/ramakrishna/trial/ex/
> # @ ckpt_file = stp.ckpt
> # @ restart_on_same_nodes = yes
> # @ requirements = Machine == "tf04"
> # @ wall_clock_limit = 5:00:00,4:30:00
> # @ queue
>
> My application
>
> #include <stdio.h>
> #include "llapi.h"
>
> int main()
> {
>     int i;
>     LL_ckpt_info ckpt_info;
>     cr_error_t cp_error1;
>
>     ckpt_info.version = LL_API_VERSION;
>     ckpt_info.step_id = NULL;
>     ckpt_info.ckptType = NULL;
>     ckpt_info.waitType = NULL;
>     ckpt_info.abort_sig = NULL;
>     ckpt_info.cp_error_data = &cp_error1;
>     ckpt_info.ckpt_rc = 0;
>     ckpt_info.soft_limit = 0;
>     ckpt_info.hard_limit = 0;
>
>     for (i = 1; i < 4000; i++)
>     {
>         printf("%d\n", i);
>         if (i == 2000)
>             ll_init_ckpt(&ckpt_info);
>     }
>     return 0;
> }
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/

----
Josh Hursey
jjhursey_at_[hidden]
http://www.open-mpi.org/