LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2003-04-23 14:48:02


On Wed, 23 Apr 2003, Irv Elshoff wrote:

> At our institute we use UNIX on a variety of platforms. To shield users
> from system (e.g., Linux, IRIX, OS/X, etc.) differences we have
> developed our own common shell environment, basically a big dot-profile
> shared by all users - but with hooks for customization - that sorts out
> things so that end users get the same "look and feel" on all systems.
> This POSIX shell based environment does not (or need not) conform to LAM
> requirements (e.g., it can produce output on stderr, among other
> things.)

Good point -- it is true that POSIX says that putting things out on stderr
is not necessarily a bad thing. It is admittedly a heuristic in LAM that
we treat anything on stderr as an error.
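
To make the heuristic concrete, here is a minimal sketch of it in Python. This is an illustration only, not LAM's actual code: the function name remote_start_ok and the two test commands are made up for the example. A launch counts as successful only if the command exits with status 0 and writes nothing at all to stderr.

```python
import subprocess
import sys


def remote_start_ok(cmd):
    """Return True only if cmd exits 0 AND writes nothing to stderr.

    Hypothetical helper that mimics the heuristic described above;
    it is not taken from the LAM sources.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0 and result.stderr == ""


# A command that succeeds but is "noisy" on stderr, like a chatty dot file.
noisy = [sys.executable, "-c", "import sys; sys.stderr.write('welcome')"]
# A command that succeeds silently.
quiet = [sys.executable, "-c", "pass"]
```

Under this rule the noisy command is rejected even though its exit status is 0, which is exactly the situation a verbose shared .profile runs into.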

> [snipped]
> Specifically, when lamboot (and recon) starts a daemon on a remote host
> it does not provide any information in the shell variable environment to
> indicate that LAM is starting. Hence we cannot disable the
> LAM-unnecessary and/or unfriendly parts of .profile (or .login, .cshrc,
> .bashunderscorewhatever).
> [snipped]
> As a newbie to LAM I would have expected that this problem would have
> been addressed in the LAM framework. For example, by a setting in the
> config file, or by a command-line argument (to lamboot/recon), or by
> default. Perhaps I've missed something in the documentation, in which
> case I'd appreciate a pointer.

Nope -- honestly, you're the first one to ask in such detail. :-)

We have always treated info that comes out on stderr as an error that
should be fixed.

> Otherwise, I'd like to suggest to the LAM community defining a shell
> variable - the name and value are insignificant if unique - whenever any
> shell-related process or program is started so that non-LAM code can
> detect LAM and take appropriate action.

I see two options -- both of which are far too late for LAM 7.0,
unfortunately:

1. Add an option to all the "boot" commands (specifically to the rsh/ssh
boot SSI module) that makes LAM ignore anything on stderr and not treat it
as an error.

2. Add something akin to what you described -- a command line option that
inserts a standard environment variable that the remote rsh/ssh target can
see in its environment.
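
Both options can be sketched with a small local simulation. Everything here is an assumption for illustration: the variable name LAM_BOOT_IN_PROGRESS, the start_remote function, and its two parameters do not exist in LAM, and a local "sh -c" stands in for the remote login shell. A real rsh/ssh boot would also need some way to get the variable to the remote side, which this sketch skips.

```python
import os
import subprocess

# Hypothetical marker name; LAM does not define this variable.
MARKER = "LAM_BOOT_IN_PROGRESS"

# Stand-in for a site-wide .profile: it greets the user on stderr
# unless the (hypothetical) marker variable is set and non-empty.
PROFILE_SNIPPET = '[ -n "${%s:-}" ] || echo "Welcome to the cluster" >&2' % MARKER


def start_remote(set_marker=False, ignore_stderr=False):
    """Simulate a remote shell start; return True if the boot would be accepted."""
    env = dict(os.environ)
    env.pop(MARKER, None)  # start from a clean state
    if set_marker:
        env[MARKER] = "1"  # option 2: tell the dot files that LAM is starting
    result = subprocess.run(
        ["sh", "-c", PROFILE_SNIPPET],
        env=env,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return False
    # Option 1: accept the boot even if something was written to stderr.
    return ignore_stderr or result.stderr == ""
```

With neither option the greeting on stderr causes a rejection. With ignore_stderr=True the output is tolerated, and with set_marker=True the profile stays quiet in the first place. The first approach needs no cooperation from the dot files; the second leaves the stderr check in place for genuine errors.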

I'm kinda leaning towards #1.

I specifically mention the rsh/ssh boot module, but this will also apply
to Globus (via the globusrun executable). This is not so much a
factor in TM or bproc environments because the "dot" files are not run on
the target nodes.

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/