LAM/MPI General User's Mailing List Archives

From: Bogdan Costescu (bogdan.costescu_at_[hidden])
Date: 2004-01-08 10:05:02


On Thu, 8 Jan 2004, Neil Storer wrote:

> Because IBM's AIX implementation of MPI does not support process
> creation and management, some of our users have asked if it would be
> possible to install LAM-MPI on our system.

By "process creation" do you mean things like fork+exec or thread creation ?
By "management" do you mean signals or something else ?
Have you already tested that LAM-MPI works well with your application ?
There are issues that you have to be aware of (like serializing thread
access to communicators, or which signals can be used); they are well
documented on the LAM-MPI website.

> over 50 SMP nodes, each node being a 32-CPU POWER4 p690 server

Nice system :-)
Given the size of the purchase, are you not able to convince IBM to add
the features that you need to their MPI implementation ?

> 1. Do you see any problem in installing LAM-MPI on such a system at that
> level of operating system?

Well, if you want to use all CPUs for one job... that would mean 1600
CPUs. Maybe the llamas can comment on the scalability of LAM-MPI (how fast
lamboot is, how fast the job starts at mpirun time, how fast lamhalt is,
whether the LAMD-LAMD communication still runs smoothly); the tens of CPUs
per job that I'm seeing here certainly do not count as a comparison :-)

> 3. Do you see any reason why using LAM-MPI would not be a good thing to
> do on our system?

Yes, your next point :-)

> 4. IBM's MPI uses LAPI as the underlying protocol and this will be tuned
> and developed for Federation. My understanding is that LAM-MPI would
> have to use TCP/IP on our system

Yes, unless you volunteer to port LAM-MPI to LAPI :-)

> with a possible degradation in performance caused by this.

That's very mildly put :-) I have worked with MPI on LAPI only on old SP2
IBMs, but I had the impression that the scaling was quite good. On the
other hand, MPI over TCP/IP would probably scale very poorly for jobs
using hundreds of CPUs... unless they communicate between nodes only
seldom - which is not my impression of weather forecasting codes :-)

> Is LAM-MPI "LoadLeveler-aware"

Not that I'm aware of. Integration is there for *PBS and (partly?) SGE.

> can I set up a BATCH job that LoadLeveler will schedule onto a set of
> nodes that will then be used by my (LAM-)MPI program?

That is a different question :-) If LoadLeveler can give a list of nodes
allocated for the job, you can write a short script that converts this to
a LAM-MPI bhosts list that would be used by lamboot. However starting the
LAMDs during lamboot might be a problem... your point 7 below.
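
A minimal sketch of such a conversion script, in Python. It assumes that
LoadLeveler exports the allocated hosts in the LOADL_PROCESSOR_LIST
environment variable as a whitespace-separated list with one entry per
allocated CPU (please check this against your LoadLeveler version), and it
writes lines in the "hostname cpu=N" form of a LAM boot schema. The function
name hosts_to_bhost is made up for this example.

```python
import os
import sys
from collections import OrderedDict


def hosts_to_bhost(hosts):
    """Collapse host names (one entry per allocated CPU) into
    LAM boot schema lines of the form 'hostname cpu=N'.
    The order of first appearance is preserved."""
    counts = OrderedDict()
    for host in hosts:
        counts[host] = counts.get(host, 0) + 1
    return ["%s cpu=%d" % (host, n) for host, n in counts.items()]


if __name__ == "__main__":
    # ASSUMPTION: LoadLeveler sets LOADL_PROCESSOR_LIST inside the job.
    hosts = os.environ.get("LOADL_PROCESSOR_LIST", "").split()
    if hosts:
        # Print the boot schema; redirect it to a file for lamboot.
        print("\n".join(hosts_to_bhost(hosts)))
    else:
        sys.stderr.write("LOADL_PROCESSOR_LIST is empty or not set\n")
```

Inside the batch job the output would be redirected to a file (for example
"bhost.ll", an arbitrary name) and that file passed to lamboot. This only
produces the host list; it does not solve the problem of starting the LAMDs
on the remote nodes.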

> 7. We only allow a very small INTERACTIVE service that is restricted to
> using just a few CPUs in one node. "telnet", "rsh", "ssh" and so on are
> prevented from running on the other nodes as these are reserved for
> BATCH jobs. Since it isn't possible to "rsh" onto the BATCH nodes, would
> this be a problem for LAM-MPI?

Yes. lamboot needs LAMRSH (by default rsh, but can be changed) to start a
LAM daemon on a remote node. This is where the awareness of the batch
system comes into play; for *PBS, LAM daemons can be started using TM, so
there is no need for LAMRSH...

> 8. Some of our programs use a hybrid parallel programming paradigm of MPI
> across nodes, with OpenMP within the nodes.

What compiler will you be using that supports OpenMP ?

> For this of course we have to use thread-safe libraries. Can
> such an approach be utilised with LAM-MPI too?

http://www.lam-mpi.org/faq/category7.php3#question5

-- 
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]