
LAM/MPI General User's Mailing List Archives


From: Scott Campbell (scc_at_[hidden])
Date: 2006-03-21 09:59:42


Thanks a lot Brian and Troy. I am entering into discussion with the PBS
Pro development team on this based on your comments.

Has this ever come up before in the mailing list? I am not seeing
anything in my search.

--
Scott Campbell
PBS Professional Support Engineer
Phone:   248-614-2400 ext. 585
Email:   scc_at_[hidden]
Technical Support Hotline: 248-614-2425
Technical Support Email:   pbssupport_at_[hidden]
-----Original Message-----
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf
Of Brian Barrett
Sent: Monday, March 20, 2006 8:55 PM
To: General LAM/MPI mailing list
Subject: Re: LAM: File permissions depend on boot module?
On Mar 20, 2006, at 11:27 AM, Scott Campbell wrote:
> I am seeing that permissions on files created by MPI programs change
> depending on which boot module I have selected.
<snip>
> My umask is set to 0022.
<snip>
> PBS script:
>
> #!/bin/bash
> echo `date`
>
> lamboot -v -ssi boot rsh $PBS_NODEFILE
> mpirun -np 2 /tmp/a.out
>
>
> Resulting files:
>
> -rw-r--r--   1 user1 g1           0 2006-03-20 12:16 node_1_of_2
> -rw-r--r--   1 user1 g1           0 2006-03-20 12:16 node_0_of_2
>
> If I change the script to this:
>
> #!/bin/bash
> echo `date`
>
> lamboot -v -ssi boot tm $PBS_NODEFILE
> mpirun -np 2 /tmp/a.out
>
>
> Resulting files:
>
> -rw-------   1 user1 g1           0 2006-03-20 12:15 node_1_of_2
> -rw-------   1 user1 g1           0 2006-03-20 12:15 node_0_of_2
>
> Is this by design?  I need the files created when using the tm boot
> module to have the -rw-r--r-- permissions.  Can this be configured?  If
> not, any pointers on where in the source code I need to tweak?

This is not completely by design, but is not LAM's doing ;).  LAM
depends on the process starting the lamd to set its umask as
appropriate.  When you use the ssh starter, the remote shell sets the
umask according to its default rules, or whatever your
initialization files set it to.  The lamd inherits this umask, and
that is the umask set when the lamd launches your application.  When
the tm starter is used, the lamd inherits the umask from the pbs mom,
and that is what is used as the umask when the lamd launches your
application.

As Troy pointed out, there are a bunch of places in the LAM/MPI
source code where we set the umask to 077.  These are in files under
<topdir>/otb/sys/, and all occur during the early part of lamd
initialization.  However, we store the original umask of the process
and, after the fork() that starts the user's process, reset the umask
to that original value.  We feel that for security reasons, the lamd
should not create files as other users.  But this shouldn't be what
is affecting you, since user applications will have the original umask.

One thing I noticed on our small PBS test setup (I believe our setup
is Torque, but based on what you are seeing I'm willing to bet the
behavior is the same in PBS Pro) is that the pbs mom appears to
always start processes with a umask of 077.  I wrote a small
application that just prints the return value of umask() in octal,
and running it through pbsdsh in a pbs job gives:

[20:41] brbarret_at_vogon:pts/2 ~> umask
22
[20:41] brbarret_at_vogon:pts/2 ~> cat $PBS_NODEFILE
vogon.osl.iu.edu
eddie.osl.iu.edu
[20:41] brbarret_at_vogon:pts/2 ~> /opt/pbs/bin/pbsdsh $HOME/my_umask
umask: 77
umask: 77

Some well-placed printfs in the lamd source code seem to indicate that
the mom does the same thing there.  So it looks like our assumption
that the starter would always do something sane with things like the
umask isn't quite right.  It works well for rsh/ssh and for SLURM
(processes started on the allocated nodes have the same umask as the
process that called srun), but not for PBS.  It's possible that we
could work around this bug and have lamboot propagate the umask for
PBS, but it would be much easier if PBS could just have the moms
start processes with an environment similar to that of the process
calling tm_spawn ;).
Brian
-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have a LAM/MPI day: http://www.lam-mpi.org/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/