Dear Jeff,
Here are some questions I'd like to discuss with you:
>I'm *guessing* that if something goes wrong in the prologue or your
>shell startup files, then PSR won't be run and you won't get a token.
--- Are there any special requirements on the job prologue when using
PSR? For testing, I only submit very simple job scripts such as
"/bin/hostname" or "sleep 5".
--- Why didn't PSR run, and how can I find out? When I submitted a
job, I could not find any "dauthr" or "dauthr_shepherd" process on the
work nodes. It seems PSR really did not run, because no token was returned.
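A rough sketch of the check described above, to be run on a work node while a job is active. It assumes that "pgrep" is available there and that the process names are exactly "dauthr" and "dauthr_shepherd"; the helper name report_proc is made up for illustration and is not part of PSR or PBS.

```shell
#!/bin/sh
# Hypothetical helper (not part of PSR or PBS): report whether a process
# with exactly this name is currently running on this node.
report_proc() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: not running"
    fi
}

report_proc dauthr
report_proc dauthr_shepherd

# "tokens" is the AFS command already used in this thread; only call it
# if it is actually installed on this node.
if command -v tokens >/dev/null 2>&1; then
    tokens
fi
```

If both processes show as not running while the job is active, that would be consistent with PSR never having been started for the job.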
....................
>> I tried to start PBS/Torque as root AFTER getting root's token, and the
>> problem went away. But should I always have to refresh root's
>> token? It sounds ironic.
> root should not have an AFS token, right? I'm guessing you can copy
> to root's $HOME because it's not on AFS.
What I really meant is that *after* PBS has inherited the AFS admin's
token, it can write job output into each user's AFS $HOME. Otherwise it
cannot work correctly (of course, it works fine without AFS). So I wonder
whether I need to install PBS as root *after* getting the AFS admin's
token, or refresh the token periodically?
Thank you very much for your reply. With your help, I think things
will be solved quickly.
Cheers,
Erming.
2005/11/23, Jeff Squyres <jsquyres_at_[hidden]>:
> On Nov 18, 2005, at 9:45 AM, 裴尔明 wrote:
>
> > I've successfully debugged and installed krb4-devel, PSR and
> > OpenPBS-2.3.16 on my server (redhat 7.3).
> > After installation, I made pub key and sec key using "mkpsrauthrkeys"
> > and generated my epass key using "pbspwstore" into $HOME/.psr_authr/
> > directory. But I cannot get my token when I submit interactive jobs.
> > There are some questions that puzzle me:
>
> I have to admit that it's been many many years since we developed and
> used this software -- we are no longer at an AFS site, so I have no
> facilities to test and debug anymore. :-(
>
> > 1. How can I ensure that everything is working? By submitting an
> > interactive job and then running "tokens"?
>
> Yes, this is a good way.
>
> > 2. Is it necessary to submit interactive jobs? Why?
>
> No. This worked for non-interactive jobs as well.
>
> > As an ordinary user, I get a token as soon as I log in. But
> > when I run "qsub -I jobname",
> > it reports:
> > [hpcsvr02] /afs/ihep.ac.cn/users/p/pemxz > qsub -I jobi
> > qsub: waiting for job 12.hpcsvr02.ihep.ac.cn to start
> > qsub: job 12.hpcsvr02.ihep.ac.cn ready
> >
> > -bash: [: -: integer expression expected
> > -bash: [: -ge: unary operator expected
>
> Yikes; that doesn't sound right. Can you tell where this is coming
> from? If I recall correctly, PSR was a compiled executable -- this
> looks like a shell script error (e.g., in your prologue, or your shell
> startup files).
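
For reference, bash prints messages of this kind when an unquoted variable is used in a numeric "[ ... ]" test and turns out to be empty or non-numeric. The fragment below is only a sketch of that failure mode and one way to guard against it; the variable SOME_LIMIT and the helper is_at_least are made up for illustration and are not taken from any actual prologue or startup file.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a non-negative integer
# that is at least $2.
is_at_least() {
    value="$1"
    minimum="$2"
    # Reject empty or non-numeric input before it reaches the numeric test.
    case "$value" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$value" -ge "$minimum" ]
}

# Fragile form, shown for comparison only (left commented out):
#   [ $value -ge 10 ]
# With an empty value, bash reports "[: -ge: unary operator expected";
# with a value of "-", it reports "[: -: integer expression expected".

# SOME_LIMIT is an illustrative variable name, not a real setting.
if is_at_least "${SOME_LIMIT:-}" 10; then
    echo "limit ok"
else
    echo "limit missing or too small"
fi
```

Searching the prologue and the shell startup files for "-ge" tests on unquoted variables may help locate where the two messages come from.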
>
> > then I run "tokens", returns nothing:
> >
> > [hpcsvr02] /afs/ihep.ac.cn/users/p/pemxz > tokens
> > Tokens held by the Cache Manager:
> >
> > --End of list--
>
> I'm *guessing* that if something goes wrong in the prologue or your
> shell startup files, then PSR won't be run and you won't get a token.
>
> > 3. I tried to submit a non-interactive job, but the status of the job
> > finally shows "E". Then I found this error report
> > in the "qstat -f" output:
> > ....
> > sched_hint = Post job file processing error; job
> > 15.hpcsvr02.ihep.ac.cn on host hpcsvr02.ihep.ac.cn/0
> >
> > Unable to copy file 15.hpcsvr02.OU to
> > hpcsvr02.ihep.ac.cn:/afs/ihep.ac.cn/users/p/pemxz/jobs.o15
> > >>> error from copy
> > /bin/cp: cannot create regular file
> > `/afs/ihep.ac.cn/users/p/pemxz/jobs.o15': Permission denied
>
> This makes perfect sense -- if you have no token, you'll get all these
> AFS errors.
>
> > I tried to start PBS/Torque as root AFTER getting root's token, and the
> > problem went away. But should I always have to refresh root's token?
> > It sounds ironic.
>
> root should not have an AFS token, right? I'm guessing you can copy to
> root's $HOME because it's not on AFS.
>
> --
> {+} Jeff Squyres
> {+} The Open MPI Project
> {+} http://www.open-mpi.org/
>
>
>
> _______________________________________________
> lam-devel mailing list
> lam-devel_at_[hidden]
> http://www.lam-mpi.org/mailman/listinfo.cgi/lam-devel
>
>