LAM/MPI General User's Mailing List Archives

From: Jiri Barnat (xbarnat_at_[hidden])
Date: 2005-05-09 07:11:03


>> 1) I would like to run a single computation that runs simultaneously
>> on computers in two different clusters. Unfortunately, computers in both
>> clusters are attached to local networks only, so they don't have
>> publicly available IP addresses. Each cluster has an access point
>> instead, which is a computer that has two network devices and two IP
>> addresses, one local and one public. Is it possible to run such a
>> computation with LAM/MPI impirun and impi_server?
>
> It should be, yes. Note that there is currently a bug with regard to IMPI
> startup in LAM 7.1.1 (fixing it is one of the last things we're waiting for
> before releasing 7.1.2).

>> 2) I tried to run a computation through an impi_server only on those 2
>> cluster access points mentioned above. However, the connection didn't
>> succeed because local IP addresses were used instead of the public ones.
>> Is there any way, like an environment variable or a hidden impirun/mpirun
>> parameter, to specify which IP address should be announced through
>> impi_server to the others?
>
> In terms of the server, you should be able to use any IP address on the
> client that resolves to the node where the server is running. That is, even
> if the server reports a.b.c.d as its IP address on its stdout (e.g., if
> a.b.c.d turns out to be the private address), if a client can only
> connect to that node via the public address w.x.y.z, the client should be
> able to use the public address as the argument to mpirun's -client parameter
> and it should work.
>
> The problem will come in the next step, however. The IMPI clients will
> upload their contact information to the IMPI server -- if they upload the
> wrong IP address (e.g., if they upload their private addresses), peer clients
> won't be able to connect.
>
> There is an environment variable that allows you to effectively override the
> local hostname that the client will use to determine its IP address. If you
> set the environment variable IMPI_HOST_NAME before running the local client,
> this is the name that the client will resolve to the IP address that is sent
> to the server. There is a brief mention of this environment variable in the
> LAM/MPI User's Guide in section 12.6 -- you might want to check this out.
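
The override described above can be illustrated with a small Python sketch.
This is not LAM/MPI's implementation, only a model of the behaviour the reply
describes: if IMPI_HOST_NAME is set, that name is resolved and the resulting
address is the one announced; otherwise the local hostname is used. The
function name advertised_address and its parameters are hypothetical and exist
only for this illustration.

```python
import os
import socket


def advertised_address(env=os.environ,
                       resolve=socket.gethostbyname,
                       local_name=socket.gethostname):
    """Return the (name, ip) pair a client would announce in this model.

    Hypothetical helper, not part of LAM/MPI. It mirrors the described
    behaviour: IMPI_HOST_NAME, when set, replaces the local hostname as
    the name that gets resolved to an IP address.
    """
    # Prefer the override; fall back to the machine's own hostname.
    name = env.get("IMPI_HOST_NAME") or local_name()
    # Resolve the chosen name; this is the address peers would try to reach.
    return name, resolve(name)
```

On a cluster access point, setting IMPI_HOST_NAME to a name that resolves to
the public interface would, in this model, make the public address the one
that is reported, while leaving the variable unset yields whatever the local
hostname resolves to (often the private address).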

I have made progress with IMPI_HOST_NAME, thanks. However, I still have some
questions ...

1) I am now able to start a computation on 2 computers (just the
access points of both clusters). However, the computation does not
initialize properly, which I guess is due to the bug you mentioned. Am I
right? I am using lam-7.2b1r10122, because using the --with-impi option in
the last stable release produces compile errors.

2) I am still not able to start a computation if I want local nodes to
participate. I guess this is because the local nodes of one cluster cannot
directly access the access point of the other cluster, where impi_server is
running. So my question is which computer in the cluster runs impid, which
represents the IMPI client, and whether there is a way to change it. (Note
that in my case impid must run on the access point of the cluster.)

Thanks in advance for any answers.

Cheers,
Jiri