On Sep 12, 2008, at 9:39 AM, sid de wrote:
> moreover, what if I want a program which spawns multiple slave
> processes and, should any of the slave processes fail, the master
> immediately comes to know and redistributes the job ...
> any simple code examples?
Sorry about the slow reply -- I've been on vacation for the last
couple of weeks.
Have a look at the example in examples/fault/ in the source tree of
the LAM/MPI releases. There's a README in there that describes the
(limited) capabilities of that particular model of fault tolerance in
LAM/MPI.
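In case it helps to see the general shape of the pattern before digging into that example, here is a rough sketch. To be clear, this is not the code from examples/fault/ and it does not use MPI at all; it only illustrates the master/worker idea with Python's standard library. The names run_master, spawn, WORKER_CODE and fail_first_attempt are made up for this illustration, and the "failure" is simulated by a worker exiting with a non-zero status. The master here notices a failure when it collects that worker's exit status, then puts the job back on the queue.

```python
import subprocess
import sys
from collections import deque

# Hypothetical worker: prints the square of its argument, or exits with
# status 1 when told to fail (this simulates a crashed worker process).
WORKER_CODE = """
import sys
n, fail = int(sys.argv[1]), sys.argv[2] == "fail"
if fail:
    sys.exit(1)
print(n * n)
"""


def spawn(job, fail):
    """Start one worker process for a single job."""
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_CODE, str(job), "fail" if fail else "ok"],
        stdout=subprocess.PIPE,
        text=True,
    )


def run_master(jobs, fail_first_attempt=(), max_workers=3, max_attempts=3):
    """Run every job, re-queuing any job whose worker exits abnormally."""
    pending = deque((job, 1) for job in jobs)  # (job, attempt number)
    running = []                               # (process, job, attempt)
    results = {}
    while pending or running:
        # Keep up to max_workers worker processes busy.
        while pending and len(running) < max_workers:
            job, attempt = pending.popleft()
            fail = attempt == 1 and job in fail_first_attempt
            running.append((spawn(job, fail), job, attempt))
        # Collect the oldest worker and check how it exited.
        proc, job, attempt = running.pop(0)
        out, _ = proc.communicate()
        if proc.returncode == 0:
            results[job] = int(out)
        elif attempt < max_attempts:
            pending.append((job, attempt + 1))  # redistribute the job
        else:
            raise RuntimeError(f"job {job} failed {attempt} times")
    return results
```

For instance, run_master([1, 2, 3, 4], fail_first_attempt={2, 4}) makes the first workers for jobs 2 and 4 die, and the master hands those jobs to fresh workers. An MPI version would have to detect the failure through whatever mechanism the implementation provides, which is what the README mentioned above covers for LAM/MPI.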
Brian
--
Brian Barrett
LAM/MPI Developer
Make today a LAM/MPI day!