Information about IMPI
- The Interoperable Message Passing Interface (IMPI) is a
standardized protocol that enables different MPI implementations to
communicate with each other. This allows users to run jobs across
different hardware while still using the vendor-tuned MPI
implementation on each machine. This is helpful in situations
where a job is too large to fit on one system, or when different
portions of the code are better suited to different MPI
implementations.
- IMPI defines only the protocols necessary between MPI
implementations; vendors may still use their own high-performance
protocols within their own implementations.
- Some new terms that are used throughout the LAM / IMPI
documentation include: IMPI clients, IMPI hosts, IMPI processes, and
the IMPI server. See the IMPI section of the LAM FAQ for definitions of these
terms.
- For more information about IMPI and the IMPI Standard, see the
main IMPI web site at http://impi.nist.gov/
Supported IMPI functionality
Note that the IMPI standard only applies to MPI-1 functionality.
Using non-local MPI-2 functions on communicators with ranks that live
on another MPI implementation will result in undefined behavior (read:
kaboom). For example, MPI_COMM_SPAWN will certainly
fail, but MPI_COMM_SET_NAME works fine.
LAM currently implements a subset of the IMPI functionality:
- startup and shutdown
- all point-to-point sends and receives
- some of the data-passing collectives:
MPI_ALLREDUCE, MPI_BARRIER,
MPI_BCAST, MPI_REDUCE
LAM does not implement the following on communicators with ranks that
live on another MPI implementation:
- MPI_PROBE and MPI_IPROBE
- MPI_CANCEL
- all data-passing collectives that are not listed above
- all communicator constructor/destructor collectives (e.g.,
MPI_COMM_SPLIT, etc.)
Running an IMPI job
- Compile your MPI program using the LAM
mpicc, mpiCC, or mpif77 (or
hcc, hcp, hf77 -- these are
equivalent to their mpiXXX counterparts).
- The IMPI server opens a socket and accepts connections until all
IMPI clients have connected. Until all clients have connected,
however, there is a window during which a malicious user could attack
the port. IMPI therefore requires that a client be authenticated by
the server, or the connection is terminated.
Currently LAM/MPI has two authentication protocols implemented:
IMPI_AUTH_NONE and
IMPI_AUTH_KEY. IMPI_AUTH_NONE is a
"no op" authentication protocol; it should only be used by sites that
can guarantee the security of their networks. The
IMPI_AUTH_KEY protocol requires that the server's and the
clients' keys match before allowing the connection. Both of these
protocols are currently mandated for all IMPI clients and hosts.
To use the IMPI_AUTH_NONE protocol, set the
IMPI_AUTH_NONE environment variable. To use the
IMPI_AUTH_KEY protocol, set the
IMPI_AUTH_KEY environment variable to the value of the key. The
value of the key must be the same on the IMPI client and server.
For simplicity, you may wish to set these environment variables in
your $HOME/.cshrc, $HOME/.profile,
$HOME/.bashrc, or your shell's startup file.
For a C shell, or csh derivative, do the following:
% setenv IMPI_AUTH_NONE
or
% setenv IMPI_AUTH_KEY 0123456789
For a Bourne Shell or sh derivative, do the following:
% IMPI_AUTH_NONE=""
% export IMPI_AUTH_NONE
or
% IMPI_AUTH_KEY="0123456789"
% export IMPI_AUTH_KEY
The IMPI protocol has been designed to allow new
authentication methods to be added; there may be multiple
authentication schemes available for the user to specify. The LAM
team suggests that users choose the "strongest" or "best" method
available. Since an IMPI server and client may not have the same
authentication schemes available, the user can construct a preference
list indicating which schemes should be used and the order in
which to attempt them.
The preference list gives the highest-numbered (most preferred) scheme
first, followed by a range of acceptable schemes. The exact scheme is
then negotiated when an IMPI job is started. LAM/MPI and the IMPI
server from Indiana University, for example, will try the most
preferred method in the list first.
For more information on the IMPI authentication mechanisms, see
Chapter 2, "Startup/Shutdown" in the IMPI standard.
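As a rough sketch of this selection logic (illustration only, not LAM's actual code), a launcher could check which of the two mandated protocols the environment has enabled, preferring the stronger IMPI_AUTH_KEY:

```shell
# Illustration only: decide which IMPI authentication protocol the
# environment has enabled, preferring the stronger IMPI_AUTH_KEY.
IMPI_AUTH_KEY="0123456789"        # example key, as in the text above
export IMPI_AUTH_KEY

if [ -n "${IMPI_AUTH_KEY+set}" ]; then
    auth="IMPI_AUTH_KEY"
elif [ -n "${IMPI_AUTH_NONE+set}" ]; then
    auth="IMPI_AUTH_NONE"
else
    auth="none enabled"
fi
echo "auth: $auth"
```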
- Launch a server using the appropriate command(s) for the IMPI
server that you are using. The IMPI server will probably expect at
least 2 command line arguments, and at most 8. For the IMPI server
from Indiana University, the command is of the following
form:
% impi-server -server <count> [-v] [-port <port_number>] [-auth <protocol-list>] [-bg]
The options are as follows:
- <count> refers to the number of client
connections that the server expects to receive.
- Verbose mode can be enabled with the -v option.
- If no -port is specified, the server will pick an
available port at random. If -port is specified and the
port is already in use, the server will abort.
- For -auth, the protocol-list should run
from most preferable first to least preferable last. If no
-auth <protocol-list> is provided, the IMPI server
and LAM will negotiate to use the
strongest authentication protocol available (i.e., whichever has been
enabled via the environment variables).
For example, to run the canonical MPI ring program
around a ring of 3 LAM/MPI clusters or clients, the command could look
like:
% impi-server -server 3 -port 3000 -auth 1-0
The server outputs its IP address and the port number to be used
in launching the clients. The server will output something similar
to:
129.74.48.55:3000
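impirun takes this address as a single host:port argument, so the line can be used as-is; if a wrapper script needs the pieces separately, they can be split with plain POSIX parameter expansion (a sketch, using the example address from above):

```shell
# Sketch: split the server's "host:port" output line into its parts
# using POSIX parameter expansion (address is the example from above).
addr="129.74.48.55:3000"
host="${addr%:*}"    # strip the ':port' suffix
port="${addr#*:}"    # strip the 'host:' prefix
echo "host=$host port=$port"
```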
- Launch the LAM/MPI client by doing a normal
lamboot
on the cluster, and use the impirun command to execute
your IMPI program.
% impirun -client <rank> <host:port> <cmd_line>
Other than the addition of the -client option
(including its rank and host:port
parameters), the command line is the same as a
regular mpirun command line. All of the
mpirun options are valid for an IMPI job with the
exception of the -O option for a homogeneous cluster.
The rank specifies where the processes belonging to
this client are placed in MPI_COMM_WORLD relative to
the other clients' processes. The rank must be a unique
number between 0 and (count - 1), where count is the
value given on the command line that launched the server.
For example, if three LAM/MPI clients have been
lambooted on separate clusters, the commands to launch a
ring MPI program across all three clusters would look like the
following:
cluster1% impirun -client 0 129.74.48.55:3000 N ring
cluster2% impirun -client 1 129.74.48.55:3000 N ring
cluster3% impirun -client 2 129.74.48.55:3000 N ring
Notice that each impirun gives a different
<rank> argument, but all give the same
<host:port> argument.
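Since the three command lines differ only in the rank, a helper script could generate them; the sketch below just echoes the commands (it does not run impirun, since each must be issued on its own cluster):

```shell
# Sketch: generate the per-cluster impirun command lines; each would
# be run on its own cluster, so they are echoed rather than executed.
addr="129.74.48.55:3000"   # server address from the example above
count=3                    # must match the server's -server <count>
rank=0
while [ "$rank" -lt "$count" ]; do
    echo "impirun -client $rank $addr N ring"
    rank=$((rank + 1))
done
```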
- Once the program is running, a normal
lamclean will
kill all ranks within a single LAM cluster. It will
probably be necessary to do a lamclean (or the
equivalent) on all IMPI clients if a job does not complete properly.