LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-02-27 09:54:50


On Fri, 27 Feb 2004, Bogdan Costescu wrote:

> > Host mapping functionality can be applied to all non-out-of-band
> > communication (i.e., MPI communication -- not native LAM/nsend-based
> > communication). This nicely fits the "slow/admin" and
> > "fast/parallel" network model.
>
> That's exactly what I'm looking for. This is especially important
> as a queueing system would normally also be configured on the "slow"
> network and so lamd could only be started on the "slow" network. So, I
> want this feature NOW :-)

Unfortunately, we do not plan to support this in the 7.0 code base. 7.0
is in "maintenance mode" only at this point; we're targeting a 7.1
release in the next few months.

> As you've put it into 7.1-cvs, maybe you can tell me how difficult it
> would be to backport it to 7.0. Would a modified lamboot be sufficient,
> or does it have to be present in the parallel program binaries as well?

This is used in all the RPI modules, so LAM would have to be recompiled,
and the application programs would need to be relinked.

The main functionality that I added consists of the following:

1. An SSI param for "mpi_hostmap" (the 7.0 code base does not have the
built-in parameter registration system that exists in 7.1; I believe it
would have to be done with vanilla getenv's in the 7.0 code base).

2. Hooks into all the rpi modules that use TCP (tcp, crtcp, sysv, usysv)
to do the address translation (e.g., share/ssi/rpi/tcp/src/ssi_rpi_tcp.c;
search for "hostmap").

3. The translation functions (share/ssi/rpi/base/ssi_hostmap.c).

4. The default [empty] installed hostmap file (etc/lam-hostmap.txt).
It's slightly easier if there's always a file there that can be parsed
rather than only parsing if there's a file there, etc.

5. Calls in the SSI startup and shutdown (open and close) to call the
hostmap startup and shutdown functions.
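
As a rough illustration of items 1, 3, and 4, here is a minimal sketch of
what a getenv-based backport of the static lookup might look like. It is
not the code from CVS. The file format (one "from to" pair per line, with
"#" comments), the environment variable name, the relative default path,
and every function and type name below are assumptions made for the
example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HOSTMAP_MAX  64    /* hypothetical cap on the number of entries */
#define HOSTNAME_LEN 256   /* hypothetical cap on hostname length */

/* One "from -> to" translation (hypothetical layout). */
struct hostmap_entry {
    char from[HOSTNAME_LEN];
    char to[HOSTNAME_LEN];
};

struct hostmap {
    struct hostmap_entry entries[HOSTMAP_MAX];
    size_t count;
};

/* Parse one line of the assumed "from to" format.  Blank lines, lines
   starting with '#', and malformed lines are skipped.  Returns 1 if an
   entry was added, 0 otherwise. */
int hostmap_parse_line(struct hostmap *map, const char *line)
{
    char from[HOSTNAME_LEN], to[HOSTNAME_LEN];

    assert(map != NULL && line != NULL);
    if (map->count >= HOSTMAP_MAX)
        return 0;
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#' || *line == '\n' || *line == '\0')
        return 0;
    if (sscanf(line, "%255s %255s", from, to) != 2)
        return 0;
    strcpy(map->entries[map->count].from, from);
    strcpy(map->entries[map->count].to, to);
    map->count++;
    return 1;
}

/* Load the whole file.  An empty file simply yields an empty map, which
   is why always installing a default file keeps this path simple.
   Returns 0 on success, -1 if the file cannot be opened. */
int hostmap_load(struct hostmap *map, const char *path)
{
    char line[2 * HOSTNAME_LEN + 16];
    FILE *fp;

    assert(map != NULL && path != NULL);
    map->count = 0;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL)
        hostmap_parse_line(map, line);
    fclose(fp);
    return 0;
}

/* Return the mapped name for host, or host itself if there is no entry. */
const char *hostmap_translate(const struct hostmap *map, const char *host)
{
    size_t i;

    assert(map != NULL && host != NULL);
    for (i = 0; i < map->count; i++)
        if (strcmp(map->entries[i].from, host) == 0)
            return map->entries[i].to;
    return host;
}

/* Plain getenv lookup standing in for the 7.1 parameter registration.
   Both the variable name and the fallback path are assumptions. */
const char *hostmap_path(void)
{
    const char *p = getenv("LAM_MPI_SSI_mpi_hostmap");
    return (p != NULL && *p != '\0') ? p : "etc/lam-hostmap.txt";
}
```

In this sketch, an RPI module would call hostmap_load(&map, hostmap_path())
once at startup and then pass each peer hostname through
hostmap_translate() before resolving its address. Unmapped hosts fall
through unchanged, so an empty file is a no-op.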

I've attached an extended grep across the CVS HEAD for all "hostmap" kinds
of things so that you can get an idea of where things are. The command I
used for grep was:

grep -r hostmap * |& grep -v Makefile | grep -v "Binary file" | grep -v \
/CVS/ | grep -v .deps | grep -v .lo: | grep -v HISTORY | grep -v doc/

> > There is now a new SSI parameter "mpi_hostmap" (its prefix of "mpi"
> > is meant to imply that it applies to all MPI SSI modules).
>
> I was thinking even further than this, to allow a command that
> receives hostnames on stdin and writes mangled ones on stdout. The
> current, file-based behaviour could then be emulated through 'cat
> hostmap'. Then, instead of writing a hostmap, I would use something
> like 'sed s/node/gige/'. Of course, this would add more complexity
> into LAM...

I'd prefer the static lookups, just for simplicity of implementation for
LAM and because it's already implemented. ;-)

It's roughly equivalent, either way -- chances are that the sysadmin is
going to set up the translation just once anyway. So whether you write a
regexp (or series of regexps if you have a highly heterogeneous cluster)
or have a list of concrete translations, it really comes down to the same
thing. Sure, you have to cut-n-paste a bit more for the static lookups,
but it's a pretty easy, straightforward method to understand.
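
To make the comparison concrete, here is a minimal sketch of the
rule-based alternative: a literal first-match substitution, roughly what
'sed s/node/gige/' does for a pattern with no regexp metacharacters. It is
only an illustration, not something LAM provides, and the function name
and signature are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Replace the first occurrence of the literal string pat in host with
   rep, writing the result to out (capacity outlen).  If pat does not
   occur, host is copied unchanged.  Returns 0 on success, -1 if the
   result does not fit in out. */
int rewrite_first(const char *host, const char *pat, const char *rep,
                  char *out, size_t outlen)
{
    const char *hit;
    int n;

    assert(host != NULL && pat != NULL && rep != NULL && out != NULL);
    hit = (*pat != '\0') ? strstr(host, pat) : NULL;
    if (hit == NULL)
        n = snprintf(out, outlen, "%s", host);
    else
        n = snprintf(out, outlen, "%.*s%s%s",
                     (int)(hit - host), host, rep, hit + strlen(pat));
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With this rule, "node07" becomes "gige07" and a name such as "master" is
left alone. A static file gets the same result by listing each
node/gige pair explicitly, which is the cut-n-paste cost mentioned above.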

Also, in future versions of LAM, this capability will be provided, but
more than likely through a different mechanism.

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/