LAM/MPI General User's Mailing List Archives

From: Pierre Valiron (Pierre.Valiron_at_[hidden])
Date: 2005-06-13 07:23:22


Jeff Squyres wrote:

> On Jun 5, 2005, at 1:18 PM, Pierre Valiron wrote:
>
>> I suppose lamhalt posts some asynchronous request to the daemon, and
>> if the TMPDIR is deleted too quickly the daemon is prevented from
>> halting. If I add some sleep between lamhalt and rm the daemon is
>> generally properly halted.
>
>
> This is exactly what is happening. Generally, when you lamhalt, it
> takes 1-4 seconds for the LAM universe to finish coming down *after*
> lamhalt returns. Some relatively uninteresting daemon ordering issues
> are the cause of this -- search this list's archives for discussions
> about it, if you care.
>
>> Of course there is a workaround using
>>
>> export LAM_MPI_SESSION_PREFIX="/some/permanent/path"
>> export LAM_MPI_SESSION_SUFFIX="some_unique_name"
>
>
> I'm a little confused -- this should suffer exactly the same problem
> as you described above. The mechanism for where LAM's session
> directory is located/found does not affect the takedown time of lamhalt.

I don't agree.

If the daemons are attached to a volatile directory, lamhalt runs into
trouble when that directory is destroyed. If the daemons are attached to
a permanent one, then lamhalt gets all the time it needs to kill the LAM
universe and remove the corresponding files.
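
For the volatile-directory case, here is a rough sketch of replacing the
fixed sleep with a bounded wait before the rm. It assumes the LAM daemon
shows up as a process named lamd and that pgrep is available on the node;
wait_for_exit is a hypothetical helper of mine, not something shipped
with LAM.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM): wait until the current user has
# no running process with the given name, or until the timeout expires.
# Returns 0 if the process is gone, 1 if it is still running at timeout.
wait_for_exit() {
    name=$1
    timeout=${2:-10}    # seconds; defaults to 10
    elapsed=0
    # pgrep -x matches the exact process name, -u restricts to our uid
    while pgrep -u "$(id -u)" -x "$name" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Intended use at the end of a batch job (assumption: daemon is "lamd"):
#   lamhalt
#   wait_for_exit lamd 10 || echo "lamd still running after 10 s" >&2
#   rm -rf "$TMPDIR"
```

The 10 second bound is arbitrary; it only needs to exceed the 1-4 seconds
mentioned above for the universe to come down after lamhalt returns.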

>
>> However it is elegant to use the unique TMPDIR to trigger a unique
>> LAM universe... and this should work, or if it can't be fixed for
>> some reason the doc should be updated accordingly.
>
>
> You're probably right; this has bitten enough people that we should do
> something about it.
>
> However, I literally just noticed that the LAM tarballs do not include
> the lamhalt man page (it exists -- I swear it!). !@#$@!$!!
>
> I'll go fix that for 7.1.2...
>
Thanks!

Best regards.
Pierre.

-- 
Support the SAUVONS LA RECHERCHE movement:
http://recherche-en-danger.apinc.org/
       _/_/_/_/    _/       _/       Dr. Pierre VALIRON
      _/     _/   _/      _/   Laboratoire d'Astrophysique
     _/     _/   _/     _/    Observatoire de Grenoble / UJF
    _/_/_/_/    _/    _/    BP 53  F-38041 Grenoble Cedex 9 (France)
   _/          _/   _/      http://www-laog.obs.ujf-grenoble.fr
  _/          _/  _/        mail: Pierre.Valiron_at_[hidden]
 _/          _/ _/      Phone: +33 4 7651 4787  Fax: +33 4 7644 8821
_/          _/_/