LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-07-01 16:48:10


The only way that I can think of to do this is to have a separate proxy
application do the forking for you -- i.e., the proxy issues the
mpirun, and your MPI app sets up communication with the proxy
(perhaps the proxy passes callback information on how to contact it
via the MPI app's command line). When your MPI app needs to fork
something, it sends a command to the proxy, which actually does the
forking.

This is admittedly a bit convoluted, but it should be portable between
all MPI implementations...
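
To make the idea concrete, here is a rough sketch in plain POSIX C of
the "ask a proxy to fork for me" exchange. It is only an illustration
under several assumptions: the names start_proxy(), proxy_run() and
proxy_loop() are made up for this example, the length-prefixed message
format is invented, and a socketpair plus an early fork() stands in for
the separate proxy program and the command-line contact information
described above. There are no MPI calls in it; in a real application the
proxy would have to exist before MPI_Init is called (or be the program
that runs mpirun).

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>

/* All names below are hypothetical; none of this is LAM/MPI API. */

/* Read exactly len bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Proxy side: for each request (length + command text), fork a child
 * that runs the command through /bin/sh, wait for it, and send back
 * the raw wait status. Returns when the requester closes its end. */
static void proxy_loop(int fd)
{
    unsigned int len;
    char cmd[1024];

    while (read_full(fd, &len, sizeof len) == 0 && len < sizeof cmd) {
        int status = -1;
        pid_t pid;

        if (read_full(fd, cmd, len) != 0) break;
        cmd[len] = '\0';

        pid = fork();
        if (pid == 0) {
            close(fd);                 /* child must not hold the channel */
            execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
            _exit(127);                /* exec failed */
        }
        if (pid > 0) waitpid(pid, &status, 0);
        if (write_full(fd, &status, sizeof status) != 0) break;
    }
}

/* Start the proxy process. Returns the descriptor used to talk to it,
 * or -1 on failure. Assumption: call this before any MPI setup. */
int start_proxy(void)
{
    int sv[2];
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        proxy_loop(sv[1]);
        _exit(0);
    }
    close(sv[1]);
    return sv[0];
}

/* Application side: ask the proxy to run cmd. Returns the command's
 * exit code, or -1 if the request failed or the command did not exit
 * normally. */
int proxy_run(int fd, const char *cmd)
{
    unsigned int len;
    int status;

    assert(cmd != NULL);
    len = (unsigned int)strlen(cmd);
    if (write_full(fd, &len, sizeof len) != 0 ||
        write_full(fd, cmd, len) != 0 ||
        read_full(fd, &status, sizeof status) != 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The intended use would be to call start_proxy() once at the very top of
main(), keep the returned descriptor, and later call something like
proxy_run(fd, "some_third_party_tool") from the master rank instead of
calling fork() directly. Closing the descriptor makes the proxy exit.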

On Jul 1, 2005, at 5:05 AM, Philip Mason wrote:

> I am currently testing with standard tcp RPI device, with a view
> to using gm/ib at some point.
>
> I have tried running on both Linux 2.6 (SLES9) and 2.4 (SLES8).
>
> Not sure I can pre-fork jobs as I need to make sure it is only
> the master MPI process that forks (and to do this I would need to
> MPI_Init/MPI_Comm_rank...).
>
> Can't really mpirun additional processes as they are non-MPI
> third-party programs which use their own communication protocol.
>
> Thanks anyway for your help.
>
> Jeff Squyres wrote:
>> I have a guess as to what is happening, but it depends on your
>> situation. Are you using the gm or ib RPI devices (which is why you
>> say you need the memory manager)?
>>
>> What OS are you running on? If it's Linux, is it 2.4 or 2.6?
>>
>> I should say that technically, MPI does not guarantee that fork will
>> work. Indeed, gm and ib apps may not even guarantee that fork works
>> properly (it's been a while since I've checked; I don't know whether
>> they currently support it or not). Hence, you might have larger
>> problems than LAM's memory manager, and we can't really help you. :-\
>>
>> An alternate approach might be to pre-fork jobs (i.e., before
>> MPI_INIT), similar to what Apache does, and then communicate with them
>> (potentially via pipes) when you need to use them.
>>
>> Or perhaps you can mpirun all the additional processes that you need,
>> and simply have the extras blocking in an MPI_RECV waiting for
>> instructions.
>>
>> Are any of these possible?
>>
>>
>>
>> On Jun 30, 2005, at 7:36 AM, Philip Mason wrote:
>>
>>
>>> I am trying to run a simple program (np=2) using LAM 7.1.1
>>> on Linux. The parent program spawns (forks) a new process
>>> and then performs a few malloc/putenvs. However, this
>>> forked-process dies inexplicably in one of the calls to malloc.
>>>
>>> I am using a simple LAM setup and have tried on various Linux
>>> machines including SLES8(AMD64), RH72(IA32).
>>>
>>> I have tried to run with LAM configured with memory manager off
>>> (i.e. --with-memory-manager=none) and this seems to fix the problem.
>>> However, I do want the memory-manager to be activated.
>>>
>>
>
> --
> ----------------------------------------
> Phil Mason
> Software Development Engineer
> Ricardo Consulting Engineers Ltd
>
> Email - Philip.Mason_at_[hidden]
> Tel: 01273 794914
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/