
LAM/MPI General User's Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-10-19 16:01:17


On Oct 18, 2004, at 9:07 AM, atarpley wrote:

> I am trying to convert an existing system from PVM to MPI. I have a few
> questions and statements and I would appreciate any comments on them.
>
> The system requires dynamic process management. I understand MPI2 has this
> functionality. The system basically has a “Service Configurator” that
> launches multiple “Services.” Each Service can be connected to any number
> of other Services. With MPI, there is an MPI_Comm_spawn method. I have
> identified this as the best way to launch external executables with MPI.
>
> Question 1: Does the application spawned by MPI_Comm_spawn have to be an
> MPI application itself? I have read that it does have to, but I have
> successfully launched Unix apps like xcalc, which I know is not an MPI app.

The MPI standard says yes, they have to be MPI applications. LAM/MPI,
for example, expects them to be MPI applications: it will hang the
parent(s) indefinitely until a spawned non-MPI app exits, at which point
a run-time error is raised in the parent(s).
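
For reference, the parent side of a spawn is only a handful of lines; a
minimal sketch (the executable name "service" and the count of 4 are just
placeholders, and "service" must itself call MPI_INIT):

    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        MPI_Comm children;    /* intercommunicator to the spawned Services */

        MPI_Init(&argc, &argv);

        /* Launch 4 copies of "service"; each must be an MPI app. */
        MPI_Comm_spawn("service", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

        /* ... talk to the children over the intercommunicator ... */

        MPI_Comm_free(&children);
        MPI_Finalize();
        return 0;
    }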

> Once Services are launched from the “Service Configurator,” they must
> establish communication with each other. I have identified the socket-like
> behavior of MPI as suitable for this (MPI_Open_port, MPI_Comm_connect, etc).

Right.
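
A sketch of the mechanics (how the port string gets from one Service to the
other -- MPI_Publish_name/MPI_Lookup_name, or your own out-of-band channel
through the Service Configurator -- is up to you):

    /* "Server" side, e.g. Service C */
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm from_a;

    MPI_Open_port(MPI_INFO_NULL, port);
    /* ...get the port string to Service A somehow... */
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &from_a);

    /* "Client" side, e.g. Service A, given the same port string */
    MPI_Comm to_c;
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &to_c);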

> Question 2: My system needs to be able to handle dynamic Service
> disconnection commands. If I wanted to disconnect Service A from B and
> connect A to C, it seems that I’d have to instruct A to disconnect from B,
> tell C to call MPI_Comm_accept, then tell A to connect to C. Is there any
> way to do this in a non-blocking manner?

Unfortunately, not without threads. The MPI standard defines these
calls as blocking. LAM, for example, does not have a timeout --
MPI_COMM_ACCEPT will hang indefinitely waiting for someone to connect.
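
The blocking sequence for your A/B/C example looks something like this (the
communicator and port names are placeholders):

    /* Service A: tear down the old connection, then dial the new one.
       Both calls block. */
    MPI_Comm_disconnect(&comm_to_b);
    MPI_Comm_connect(port_of_c, MPI_INFO_NULL, 0, MPI_COMM_SELF, &comm_to_c);

    /* Service C: a matching accept has to be posted for A's connect to
       complete. */
    MPI_Comm_accept(port_of_c, MPI_INFO_NULL, 0, MPI_COMM_SELF, &comm_from_a);

If you need C to stay responsive while it waits, the accept has to live in
its own thread -- see the MPI_THREAD_MULTIPLE discussion below.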

> What if I wanted to connect multiple Services to Service C? Would Service C
> have to do multiple blocking MPI_Comm_accept calls? Wouldn't this create a
> logjam with the other Services trying to connect to Service C?

Yes. Just like sockets. There's really no other way to do it -- if
you want to have N entities connect to 1 entity, then you have to
serialize somewhere.
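
So on Service C's side you end up with something like this (N and the port
string are placeholders):

    /* Service C: accept the N incoming connections one at a time */
    MPI_Comm clients[N];
    for (int i = 0; i < N; ++i) {
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &clients[i]);
    }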

> Question 3: My system requires a high fault tolerance. With permanent MPI
> pipelines being open between Services and between Services and the “Service
> Configurator,” isn’t there a significant chance for an error in
> communication to bring down the entire MPI system?

Yes. You might want to examine the MPI definition of "connected" and
its relation to MPI_FINALIZE and MPI_ABORT -- these things are
discussed in the MPI-2 standard, the dynamic processes chapter. The
end result is that MPI allows an error to take down *all* connected
processes, but allows an implementation to do something better if it
wants.
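
One practical consequence is that it pays to keep the "connected" set as
small as you can: when two Services are done talking, finish any outstanding
communication and disconnect. A sketch (the communicator name is
illustrative):

    /* Make sure all communication on this intercommunicator has completed
       (MPI_Comm_disconnect requires it), then sever the link.  Once the two
       sides are disconnected -- and no other communication path still links
       them -- an error or abort on one side is no longer required by the
       standard to take the other down. */
    MPI_Comm_disconnect(&intercomm);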

> Is there any way to recover smoothly from a seg fault or fatal error in
> communication? As a test, I purposely caused a seg fault in a Service while
> it was connected to another Service with MPI and it brought down both
> Services. Any way around this?

Don't have seg faults. ;-)

Right now (i.e., the current state of MPI implementations), not really. :-(

FT is an area still largely unexplored in MPI. Some people have done
some work in this area -- e.g., FT-MPI has done some interesting stuff,
and we'll be extending that work, LAM's checkpoint/restart stuff, and a
variety of other FT things (like data fault tolerance, run-time
exception tolerance, etc.) in Open MPI.

> Question 4: Are there any caveats to making an MPI shared object? I would
> like all of the Services to dynamically use a shared “Dispatcher” object
> that uses MPI as the message passing paradigm.

I assume you mean a shared library that is loaded and unloaded from a
process at run-time?

No, that should work fine. However, be aware that the MPI standard
says that MPI_INIT/MPI_INIT_THREAD and MPI_FINALIZE are only allowed to
be called once per process. That being said, you can *probably* get
around that error if you *completely* unload the MPI portion of your
app from the process -- i.e., there's no state left to tell MPI (upon a
later re-load) that it was previously MPI_INIT'ed.
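
One way to stay on the safe side is to have the shared Dispatcher guard its
own startup, something like this (the function name is just illustrative):

    #include <mpi.h>

    /* Called by whatever loads the Dispatcher.  Initializes MPI only if
       this process hasn't already done so (e.g., by an earlier load). */
    void dispatcher_startup(void)
    {
        int initialized = 0, finalized = 0;

        MPI_Initialized(&initialized);
        MPI_Finalized(&finalized);

        if (!initialized && !finalized) {
            MPI_Init(NULL, NULL);
        }
        /* If finalized is already true, this process has used up its one
           MPI_INIT/MPI_FINALIZE pair -- see the caveat above. */
    }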

> Question 5: Lastly, I don’t even know if MPI is right for this kind of
> system. The Service Configurator and Services will always be using just one
> MPI process each. So in effect, the only thing I am using MPI for would be
> the message passing between Services and between Services and the Service
> Configurator. Is this proper use of MPI?

Sure. MPI is a generic message passing system, and most MPI
implementations define an MPI process as an operating system process.
However, not many of them support concurrency in threads -- Open MPI
does. Open MPI supports MPI_THREAD_MULTIPLE, meaning that even though
Open MPI defines an MPI process as an OS process, you can still use
multiple threads concurrently and even have one thread send a message
to another thread. You can also do stuff like allow MPI_COMM_ACCEPT to
block in a thread while your app goes off and does other stuff. This
allows some fairly interesting scenarios.
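
A sketch of that last scenario, assuming an MPI library that really does
provide MPI_THREAD_MULTIPLE and using pthreads (port distribution elided):

    #include <mpi.h>
    #include <pthread.h>

    static char port[MPI_MAX_PORT_NAME];

    /* Blocks in accept without holding up the main thread. */
    static void *accept_thread(void *arg)
    {
        MPI_Comm client;
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
        /* ... service the new connection ... */
        return NULL;
    }

    int main(int argc, char *argv[])
    {
        int provided;
        pthread_t tid;

        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE) {
            /* The library can't do this; fall back to single-threaded. */
        }

        MPI_Open_port(MPI_INFO_NULL, port);
        pthread_create(&tid, NULL, accept_thread, NULL);

        /* ... main thread goes off and does other stuff ... */

        pthread_join(tid, NULL);
        MPI_Finalize();
        return 0;
    }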

However, you do have to obey MPI semantics. You need to ask yourself
what the benefits and drawbacks are. Are high bandwidth and low latency
requirements for this system? Or is writing this with sockets (and
therefore always using TCP) easier because you'll potentially have fewer
constraints (particularly in the FT arena)?

Sorry that some of this has sounded like an advertisement for Open MPI;
I'm actually quite excited about it because we're doing things in an
MPI implementation that should have been done a long time ago and will
enable things like you're trying to do.

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/