Table of contents:
- What is LAM/MPI?
- What is "SSI"?
- What is an "SSI Parameter"?
- What is an "available SSI module"?
- What is a "selected SSI module"?
- What is the "type" of an SSI module?
- What is an "MPI process"?
- What is a "LAM process"?
- What is a "Request Progression Interface (RPI) module"?
- What is an "origin node"?
- What is a "local node"?
- What is a "boot schema"?
- What is an "application schema"?
- What is a "Beowulf cluster"?
1. What is LAM/MPI?
LAM (Local Area Multicomputer) is an open source implementation of the
Message Passing Interface
(MPI) standard. The MPI standard is the de facto industry standard
for parallel applications. It was designed by leading industry and
academic researchers, and builds upon two decades of parallel
programming experience.
Implementations of MPI (such as LAM) provide an API of library
calls that allow users to pass messages between nodes of a parallel
application (along with a bunch of other bells and whistles). See the
MPI Forum web site for more
details about MPI.
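For example, a user writes a program to the MPI API, compiles it with
one of LAM's wrapper compilers, and launches it with mpirun. The
following is only a sketch: the file name hello.c is hypothetical,
and it assumes that a LAM run-time environment has already been
booted with lamboot.
shell$ mpicc hello.c -o hello
shell$ mpirun C hello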
LAM/MPI provides users not only with the standard MPI API, but
also with several debugging and monitoring tools. While specifically
targeted at heterogeneous clusters of Unix workstations, LAM runs on a
wide variety of Unix platforms, from desktop workstations to large
"supercomputers" (and everything in between).
LAM includes a full implementation of the MPI-1 standard as well
as many elements of the MPI-2 standard. The release notes for the
latest version of LAM include a more complete list of features.
The LAM/MPI Installation Guide and User's Guide (both included
with the LAM/MPI distribution and available on the LAM/MPI web site)
and the lam(7) manual page are helpful in defining what MPI
is, and how LAM relates to MPI. Novice users are strongly
encouraged to read the User's Guide.
2. What is "SSI"?
Applies to LAM 7.0 and above
SSI stands for System Services Interface. It is a component framework
that comprises the core of LAM/MPI. Many of LAM's back-end services
are now performed by SSI modules. The SSI framework allows the
selection of modules at run-time without the need to re-compile user
MPI applications (e.g., the choice of which underlying network message
passing system to use); see the example after the list below.
There are currently four types of SSI modules:
- boot: Forms the back-end of lamboot, recon, and wipe. Allows the
LAM run-time environment to be launched natively in various operating
environments.
- coll: MPI collective communications.
- cr: Interfaces to different back-end checkpoint/restart systems
that allow the pausing and resuming of parallel MPI jobs.
- rpi: MPI point-to-point communications (the Request Progression
Interface).
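Here is the run-time selection example mentioned above. Each SSI
type has an SSI parameter of the same name whose value names the
module to use. For example, to force the use of the tcp RPI module
(a sketch; the program name is hypothetical, and a LAM run-time
environment is assumed to be booted):
shell$ mpirun -ssi rpi tcp C my_mpi_program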
3. What is an "SSI Parameter"?
Applies to LAM 7.0 and above
SSI modules have the ability to receive parameters from the user at
run-time. The most common methods to pass SSI parameters are via
command line switches or environment variables (command line switches
will take precedence over environment variables). Unless specifically
noted, SSI parameters must be set before an MPI program is executed.
For example, calling the C function setenv() after an MPI
process has started is not guaranteed to set the parameter properly.
mpirun and lamboot can take the
"-ssi" command line switch with two additional values:
the name of the SSI parameter and the value to set it to. For
example, to set the maximum size of short messages in the TCP RPI module:
shell$ mpirun -ssi rpi_tcp_short 131072 C my_mpi_program
Environment variables can also be used. Simply prefix the SSI
parameter name with "LAM_MPI_SSI_". Such variables will
typically be propagated automatically to the relevant nodes. Consider
the following Bourne shell example:
shell$ LAM_MPI_SSI_rpi_tcp_short=131072
shell$ export LAM_MPI_SSI_rpi_tcp_short
shell$ mpirun C my_mpi_program
The following is the same example in C-shell-style shells:
shell% setenv LAM_MPI_SSI_rpi_tcp_short 131072
shell% mpirun C my_mpi_program
A full listing of all SSI parameters is available in the LAM/MPI
User's Guide.
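The laminfo command can also show the SSI parameters that a given
installation supports. For example, to show the parameters of the
tcp RPI module (a sketch; the exact output varies between
installations -- see laminfo(1)):
shell$ laminfo -param rpi tcp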
4. What is an "available SSI module"?
When an SSI module is described as "available", that means that it has
been queried (usually at process startup time) and has indicated that
it is willing to run in the current process. Hence, the module(s)
that are eventually selected will be drawn from the set of modules
that indicated they were available.
It is possible that modules will report that they are
not able to run. For example, the gm RPI SSI
module will report that it is unable to run if there is no Myrinet
hardware available in the machine that it is running on. In such
cases, the gm module will not be among the set of modules
available for selection.
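Although availability is only truly determined when a process starts,
the laminfo command shows which SSI modules were compiled into a
given LAM installation, which is a useful first diagnostic:
shell$ laminfo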
5. What is a "selected SSI module"?
When an SSI module is described as "selected", that means that it is
going to be used in the current process.
Specifically, a selected module previously indicated that it was
available for selection, and then won the selection process for the
current process. Depending on the SSI type, one or more modules may
be selected within a single process.
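SSI parameters can be used to influence the selection process. For
example, the following Bourne shell commands request that the usysv
RPI module be selected (the program name is hypothetical); if the
requested module is not available, the job should fail to start
rather than silently falling back to another module:
shell$ LAM_MPI_SSI_rpi=usysv
shell$ export LAM_MPI_SSI_rpi
shell$ mpirun C my_mpi_program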
6. What is the "type" of an SSI module?
Applies to LAM 7.0 and above
SSI is a component architecture with several different types of
components. Also known as the "kind" of SSI module, the "type" of a
module refers to what class of functionality that module provides.
The types that are currently defined are:
- boot: Used to establish the LAM run-time environment, typically as
the back-end to the lamboot command.
- coll: Used to perform MPI collective communications.
- cr: Used to checkpoint and restart parallel MPI jobs.
- rpi: Otherwise known as the Request Progression Interface (RPI),
this module type is used to effect MPI point-to-point communication.
Modules of each type are all available in the LAM/MPI distribution.
Documentation of the protocols and APIs for each type is available
on the LAM/MPI web site -- those wishing to extend the LAM
implementation and/or experiment with different aspects of MPI (e.g.,
write new algorithms for MPI's collective functions) are encouraged to
see the "Documentation" section of the LAM/MPI web site for details.
7. What is an "MPI process"?
A UNIX process becomes an MPI process by invoking
MPI_Init(). It ceases to be an MPI process after
invoking MPI_Finalize(). Note that an MPI process --
when running under LAM/MPI -- is also a LAM process.
An MPI process may have zero or more peers that were launched at
the same time (via mpirun or mpiexec or one
of the MPI-2 dynamic process function calls) that are in the same
MPI_COMM_WORLD. Other MPI process peers are also
possible -- the MPI-2 dynamic process calls allow for subsequent
connections and attachments after an MPI process is initially launched.
8. What is a "LAM process"?
A UNIX process becomes a LAM process by attaching to the local LAM
daemon via a library function. Once the process detaches from the LAM
daemon it is no longer considered to be a LAM process. An MPI process
run under LAM/MPI becomes a LAM process in MPI_Init() and ceases to be one after MPI_Finalize().
9. What is a "Request Progression Interface (RPI) module"?
An RPI module in LAM implements the manner in which an MPI
point-to-point message progresses from a source MPI process to a
destination MPI process. The RPI is just one of the types of SSI
modules that LAM/MPI uses.
LAM offers multiple RPI modules in the default distribution:
- gm: using native Myrinet message passing
- ib: using native InfiniBand message passing
- lamd: using asynchronous (but slow) UDP-based message passing
- tcp: using TCP
- sysv: using shared memory (with SystemV semaphores) and TCP
- usysv: using shared memory (with spin locks) and TCP
The two shared memory RPIs use shared memory for communication
between MPI ranks on a single node, but use TCP/IP for communication
between ranks on different nodes. The only difference between them is
the manner in which shared-memory synchronization is performed. The
usysv (spin locks) RPI is probably best suited for SMP
architectures; the sysv (semaphores) RPI is probably best suited for
uniprocessors.
10. What is an "origin node"?
An origin node is the node on which "lamboot" was invoked to boot
the LAM run-time environment.
11. What is a "local node"?
When discussing a command, we refer to the node on which the command
was invoked as the local node.
12. What is a "boot schema"?
A boot schema is a description of a multicomputer on which LAM will be
run. It is usually a list of hostnames on which LAM will be booted,
and is also sometimes called a "hostfile". However, a "boot schema"
can also contain a username for each machine, for the case where a
user has multiple account names across different computers. The "boot
schema" can also contain information regarding the number of
processors in a machine. For more information on multi-processor
machines, see the question "How do I lamboot multi-processor
machines?" in the "Booting LAM" section.
13. What is an "application schema"?
An application schema is a description of the applications that will
be launched on each node in the multicomputer. Specific applications
and options can be identified for each node. Application schemas are
typically used for MPMD applications, where different binaries need
to be launched on different nodes.
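Here is a sketch of an application schema for an MPMD job (the node
ranges and program names are hypothetical). Each line lists a set of
nodes followed by the program to launch there; the schema file is
then given to mpirun in place of a program name (see mpirun(1)):
shell$ cat my_appschema
n0 master
n1-3 slave
shell$ mpirun -v my_appschema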
14. What is a "Beowulf cluster"?
The term "Beowulf cluster" refers to a cluster of workstations
(usually Intel architecture, but not necessarily) running some flavor
of Linux that is utilized as a parallel computation resource. The main idea is to use commodity, off-the-shelf computing components to create a networked cluster of workstations.
LAM is a favorite tool of many Beowulf cluster users; they use LAM
to write parallel programs to run on their clusters. LAM tends to be
"cluster friendly" by using small daemons to effect fast process
control, application startup/shutdown, etc.