Very sorry for the lack of clarity. This solution works with the LAM 7.x
series, under LSF versions as far back as v4.2.
chris
_____
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf
Of YoungHui Amend
Sent: Tuesday, April 25, 2006 5:33 AM
To: General LAM/MPI mailing list
Subject: Re: LAM: lamboot without rsh/ssh
Is this "lsgrun -m" only available in the LSF 7.x series? My customer is
running LSF 6.1.
_____
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf
Of Christopher Porter
Sent: Monday, April 24, 2006 4:03 PM
To: General LAM/MPI mailing list
Subject: Re: LAM: lamboot without rsh/ssh
There is a method under the 7.x series in an LSF environment to use a
remote execution utility called "lsgrun". Below is an excerpt from
a document I've been writing.
ENVIRONMENT
Several enterprise environments which use Platform software to manage
their workload also put constraints on users to goad them into using the
LSF system. One of these constraints is denying direct login to
execution hosts. In general this does not impede LSF from
allowing users to submit and hand off execution launch of their jobs to
LSF, but in the case where LAM MPI is required, such an environment is
problematic because of the required "lamboot" process. This process
assumes rsh/ssh login access to get the MPI daemons started. Such an
assumption is in conflict with the no-login environment, and can force
problematic exceptions to environment management rules.
In some cases, third party applications are delivered with LAM-MPI
libraries pre-compiled so solutions to this problem should not require
code modification or recompilation of the LAM libraries if at all
possible.
GOAL
Since LSF has its own internally used authentication mechanism, instead
of using rsh/ssh it is possible to use the base command "lsgrun" to make
the remote connection to execution hosts and launch the daemons. Once
the daemons are launched, the job can be launched in a similar fashion
to any serial job.
METHOD
LAM-MPI has always used a default remote connection method to start the
communication daemons it uses during a parallel application run - "rsh".
As security became more important in compute environments, the
developers of LAM introduced an environment variable LAMRSH which
would be used if defined to redirect the remote connection application.
Most folks use "ssh". In this case we set it thusly:
LAMRSH="lsgrun -m"
This redirection alone is not enough, though. The lamboot code that
constructs the system call command lines assumes that whatever command
is named in the LAMRSH variable accepts the same switches (such as "-n")
with the same meaning. This isn't true for "lsgrun", so further
modification is needed.
Thankfully recent versions of LAM documentation are complete enough to
provide enough of a peek under the hood that one can control the daemon
boot process at a more fundamental level.
The user guide at http://www.lam-mpi.org/using/docs/, specifically
Chapter 8, describes LAM Modules. LAM uses a plugin architecture called
"boot modules" to provide integration with various environments
including Scyld Beowulf, Globus, PBS/Torque, and the generic rsh/ssh
environment. This generic environment is the one we intend to operate
upon since no explicit LSF integration exists.
Table 8.3 is specifically interesting because it details the additional
parameters that can be passed to the boot module used by the lamboot
process. The ones that are most interesting are boot_rsh_no_n,
boot_rsh_fast, boot_rsh_ignore_stderr and boot_rsh_no_profile.
boot_rsh_no_n - removes the "-n" switch normally applied to rsh/ssh
    system calls, which under those protocols redirects input from the
    special device /dev/null.
boot_rsh_fast - skips the step which determines the shell on remote
    hosts and assumes the local shell is used.
boot_rsh_ignore_stderr - by default, the lamboot process will terminate
    before completing if any data shows up in STDERR. If the "lsgrun"
    substitution produces some output to this device, this switch could
    be very handy, especially if the data is inconsequential to the
    overall goal.
boot_rsh_no_profile - prevents the daemon launch process from trying
    to run the user's .profile.
One can successfully boot LAM using "lsgrun" in an environment where
users cannot log in via rsh or ssh (/etc/passwd contains shell
definitions of /bin/false, for instance) with the following setup:
for the Bourne again Shell (bash):
1) export LAMRSH="lsgrun -m"
2) lamboot -v -ssi boot_rsh_no_n 1 -ssi boot_rsh_fast 1 -ssi boot_rsh_no_profile 1 <schema file>
for Csh or tcsh:
1) setenv LAMRSH "lsgrun -m"
2) lamboot -v -ssi boot_rsh_no_n 1 -ssi boot_rsh_fast 1 -ssi boot_rsh_no_profile 1 <schema file>
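For anyone who would rather not retype the two steps, they can be wrapped in a small Bourne shell function. This is only a sketch under assumptions: the function name lamboot_via_lsgrun and the DRY_RUN variable are made up for illustration and are not part of LAM or LSF. It sets LAMRSH, turns each boot module parameter name into a "-ssi <name> 1" pair, and by default only prints the resulting command so it can be inspected before it is run for real.

```shell
#!/bin/sh
# Hypothetical helper: lamboot_via_lsgrun and DRY_RUN are illustrative
# names, not part of LAM/MPI or LSF.
lamboot_via_lsgrun() {
    if [ $# -lt 1 ]; then
        echo "usage: lamboot_via_lsgrun <schema file> [extra ssi params]" >&2
        return 2
    fi
    schema="$1"    # path to the boot schema file
    shift

    # Route LAM's remote launch through lsgrun instead of rsh/ssh.
    LAMRSH="lsgrun -m"
    export LAMRSH

    # The three parameters used in the setup above, plus any extra
    # parameter names from the caller (e.g. boot_rsh_ignore_stderr).
    params="boot_rsh_no_n boot_rsh_fast boot_rsh_no_profile $*"

    # Rebuild the positional parameters as the lamboot command line.
    set -- lamboot -v
    for p in $params; do
        set -- "$@" -ssi "$p" 1
    done
    set -- "$@" "$schema"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show what would be executed.
        printf "LAMRSH='%s' %s\n" "$LAMRSH" "$*"
    else
        "$@"
    fi
}
```

Calling "lamboot_via_lsgrun <schema file>" prints the command; setting DRY_RUN=0 first actually runs lamboot. Passing boot_rsh_ignore_stderr as a second argument adds that switch, which may help if lsgrun writes harmless output to STDERR.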
-----Original Message-----
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf
Of Phil Ehrens
Sent: Monday, April 24, 2006 12:44 PM
To: General LAM/MPI mailing list
Subject: Re: LAM: lamboot without rsh/ssh
YoungHui Amend wrote:
> I'm using an old version (6.3) of LAM/MPI. In this version of LAM,
> lamboot uses rsh to run hboot which forks lam demon (lamd). The problem
> we are running into is that rsh is not allowed (for security reasons)
> on the cluster of machines connected to LSF. Ssh also causes problems
> because it prompts you for the password. I know there's a way to setup
> ssh so it doesn't prompt for a password, but it is not a viable option.
> So, is there a way to fork lam demon without going through lamboot?
You have an installation where rsh can't be used due to security
concerns... and ssh can't be used because it's "not viable"?
Do they make you communicate with your coworkers by blinking flashlights
at them?
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/