LAM/MPI General User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-08-31 17:44:05


On Thu, 2006-08-31 at 22:29 +0800, 龚念 wrote:
> Hello everyone,
> I used OpenSSI and DRBD to build a cluster system two weeks ago.
> Now I want to use LAM/MPI to do some checkpoint tasks. As far as I know,
> LAM/MPI supports SMP machines, and a cluster running OpenSSI is like one
> large SMP machine. So I want to know whether LAM/MPI can be used on
> OpenSSI clusters. Can anybody who has used it tell me?
> I am new to LAM/MPI, so please help me. Please reply
> soon; I urgently need to know. Thanks.

LAM/MPI generally has issues running on SSI-like systems, especially if
they support migration of applications between nodes (and definitely
with checkpoint / restart systems not explicitly supported by LAM). The
issue boils down to TCP sockets - if a process moves, the TCP sockets
have to follow, and my experience has been SSI systems don't get this
right. That being said, if you can make OpenSSI not move processes
around (I have no idea -- I'm not familiar with OpenSSI at all), then
you might have a reasonable chance of making LAM work under OpenSSI.
But this is definitely not a configuration we support.
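
To make the socket issue concrete, here is a minimal sketch in Python (not
LAM code; the function name connection_endpoints is made up for
illustration). It opens a TCP connection over loopback and prints how each
end identifies the connection: by its own address and port plus the peer's
address and port. That state lives in the kernel of the host where the
process runs, so if a process were moved to a node with a different address,
the peer would presumably still be talking to the old endpoint unless the
system transparently carried the connection along.

```python
import socket


def connection_endpoints():
    """Open a loopback TCP connection and return each end's view of it.

    Each view is a pair (local address, peer address), where an address
    is an (ip, port) tuple as reported by the kernel.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server, _ = listener.accept()
    try:
        client_view = (client.getsockname(), client.getpeername())
        server_view = (server.getsockname(), server.getpeername())
        return client_view, server_view
    finally:
        client.close()
        server.close()
        listener.close()


client_view, server_view = connection_endpoints()
print("client sees (local, peer):", client_view)
print("server sees (local, peer):", server_view)
```

Each side's "peer" is exactly the other side's "local" address, which is why
both ends of every connection would have to be kept consistent if a process
moved.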

Brian