LAM/MPI General User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-03-21 10:07:27


On Mar 21, 2006, at 9:59 AM, Scott Campbell wrote:

> Thanks a lot Brian and Troy. I am entering into discussion with the
> PBS Pro development team on this based on your comments.
>
> Has this ever come up before in the mailing list? I am not seeing
> anything in my search.

Surprisingly, it has never come up before. I don't know if that's
because most people don't pay much attention to file permissions or
if no one figured out it was LAM doing the permissions munging.

Brian

> -----Original Message-----
> From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On
> Behalf Of Brian Barrett
> Sent: Monday, March 20, 2006 8:55 PM
> To: General LAM/MPI mailing list
> Subject: Re: LAM: File permissions depend on boot module?
>
> On Mar 20, 2006, at 11:27 AM, Scott Campbell wrote:
>
>> I am seeing that permissions on files created by MPI programs change
>> depending on which boot module I have selected.
>
> <snip>
>
>> My umask is set to 0022.
>
> <snip>
>
>> PBS script:
>>
>> #!/bin/bash
>> echo `date`
>>
>> lamboot -v -ssi boot rsh $PBS_NODEFILE
>> mpirun -np 2 /tmp/a.out
>>
>>
>> Resulting files:
>>
>> -rw-r--r-- 1 user1 g1 0 2006-03-20 12:16 node_1_of_2
>> -rw-r--r-- 1 user1 g1 0 2006-03-20 12:16 node_0_of_2
>>
>> If I change the script to this:
>>
>> #!/bin/bash
>> echo `date`
>>
>> lamboot -v -ssi boot tm $PBS_NODEFILE
>> mpirun -np 2 /tmp/a.out
>>
>>
>> Resulting files:
>>
>> -rw------- 1 user1 g1 0 2006-03-20 12:15 node_1_of_2
>> -rw------- 1 user1 g1 0 2006-03-20 12:15 node_0_of_2
>>
>> Is this by design? I need the files created when using the tm boot
>> module to have the -rw-r--r-- permissions. Can this be configured?
>> If not, any pointers on where in the source code I need to tweak?
>
> This is not completely by design, but is not LAM's doing ;). LAM
> depends on the process starting the lamd to set its umask as
> appropriate. When you use the rsh/ssh starter, the remote shell sets
> the umask according to its default rules, or to whatever your
> initialization files set it to. The lamd inherits this umask, and
> that is the umask set when the lamd launches your application. When
> the tm starter is used, the lamd inherits the umask from the pbs mom,
> and that is what is used as the umask when the lamd launches your
> application.
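To make the effect of the inherited umask concrete, here is a minimal sketch of how the permission bits of a new file are derived. It assumes the test program creates its files with a requested mode of 0666 (as fopen() does); the thread does not show that program, and effective_mode() is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Permission bits a newly created file ends up with: the mode the
 * program asked for, with every bit that is set in the umask cleared.
 * (Hypothetical helper, for illustration only.)
 */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask & 0777;
}
```

With a umask of 022 a request for 0666 yields 0644 (-rw-r--r--), and with a umask of 077 it yields 0600 (-rw-------), which matches the two listings above.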
>
> As Troy pointed out, there are a bunch of places in the LAM/MPI
> source code where we set the umask to 077. These are in files under
> <topdir>/otb/sys/, and all occur during the early part of lamd
> initialization. However, we store the original umask of the process
> and, after the fork() that starts the user's process, reset the umask
> to that original value. We feel that, for security reasons, the lamd
> should not create files that other users can access. But this
> shouldn't be what is affecting you, since user applications run with
> the original umask.
>
> One thing I noticed on our small PBS test setup (I believe our setup
> is Torque, but based on what you are seeing I'm willing to bet the
> behavior is the same in PBS Pro) is that the pbs mom appears to
> always start processes with a umask of 077. I wrote a small
> application that just prints the return value of umask() in octal,
> and running it through pbsdsh in a PBS job gives:
>
> [20:41] brbarret_at_vogon:pts/2 ~> umask
> 22
> [20:41] brbarret_at_vogon:pts/2 ~> cat $PBS_NODEFILE
> vogon.osl.iu.edu
> eddie.osl.iu.edu
> [20:41] brbarret_at_vogon:pts/2 ~> /opt/pbs/bin/pbsdsh $HOME/my_umask
> umask: 77
> umask: 77
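The my_umask program itself is not shown in the thread. A minimal sketch of what such a check might look like follows, with hypothetical function names. Since umask() can only report the current mask by setting a new one, the sketch immediately puts the old value back.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Read the process umask without leaving it changed. */
mode_t current_umask(void)
{
    mode_t mask = umask(0);   /* returns the previous mask */
    umask(mask);              /* restore it right away */
    return mask;
}

/* Print the mask in octal, in the same style as the output above. */
void print_umask(FILE *out)
{
    fprintf(out, "umask: %o\n", (unsigned int) current_umask());
}
```

Calling print_umask(stdout) from main() gives a program that can be run through pbsdsh in the same way.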
>
> Some well-placed printfs in the lamd source code seem to indicate that
> the mom does the same thing there. So it looks like our assumption
> that the starter would always do something sane with things like the
> umask isn't quite right. It works well for rsh/ssh and for SLURM
> (processes started on the allocated nodes have the same umask as the
> process that called srun), but not for PBS. It's possible that we
> could work around this bug and have lamboot propagate the umask for
> PBS, but it would be much easier if PBS could just have the moms
> start processes with an environment similar to that of the process
> calling tm_spawn ;).
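Until the starter side changes, one application-level option is to stop depending on the inherited umask. The sketch below is an assumption on my part, not something proposed in this thread: it creates the file and then sets the permission bits explicitly with fchmod(), which is not subject to the umask. The helper name is hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create (or truncate) a file and force its permission bits to 'mode',
 * whatever umask the process inherited. Returns an open descriptor,
 * or -1 on failure.
 */
int create_with_mode(const char *path, mode_t mode)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
    if (fd < 0)
        return -1;
    if (fchmod(fd, mode) != 0) {   /* fchmod() ignores the umask */
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling umask(022) once at the start of the program would have a similar effect for everything it creates afterwards; the fchmod() variant only touches the files it is applied to.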
>
> Brian
>
>
> --
> Brian Barrett
> LAM/MPI developer and all around nice guy
> Have a LAM/MPI day: http://www.lam-mpi.org/
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/