Hi, thanks for your reply
I just tried the latest nightly tarball (lam-7.1a1r9568), and it seems that
the bug is still not fixed.
How can I know when the bug is addressed in the development version?
Maybe I could help, if you have some hints about where to look.
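
In case it is useful, here is a minimal sketch of how the open-file count
of the daemon can be monitored while reproducing the problem. It assumes a
Linux /proc filesystem; the function names count_open_fds and
watch_open_fds are only illustrative and are not part of LAM.

```shell
#!/bin/sh
# Count the file descriptors currently open in a process (Linux /proc only).
# Plain "ls" prints no "total" line, so unlike "ls -l | wc -l" the result
# does not need the "plus one" correction.
count_open_fds() {
    pid=$1
    if [ ! -d "/proc/$pid/fd" ]; then
        echo "no /proc entry for pid: $pid" >&2
        return 1
    fi
    ls "/proc/$pid/fd" | wc -l | tr -d ' '
}

# Print the count every few seconds until the process goes away.
# The interval (in seconds) is optional and defaults to 5.
watch_open_fds() {
    pid=$1
    interval=${2:-5}
    while [ -d "/proc/$pid/fd" ]; do
        echo "$(date '+%H:%M:%S') pid=$pid open_fds=$(count_open_fds "$pid")"
        sleep "$interval"
    done
}
```

With a single lamd running under my account, I would call it as
watch_open_fds "$(pgrep -u "$USER" lamd)" in one terminal and interrupt
lamexec jobs in another, watching whether the count keeps growing.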
Thanks anyway
Karl
>Hi --
>
>My apologies for the very very late reply to this question!
>
>Well, I was able to replicate this with 7.0.4 when trying to do CTRL-C
>twice in quick succession. This seems to be taken care of in our
>repository development version. It probably would not go into our 7.0.5
>release due shortly, but will make it into the 7.1 release (which may take
>a month's time). If this problem is not severe for you and you can wait
>until then, things are fine. Otherwise you can get an anonymous checkout of
>our repository version or the latest nightly tarball (check
>http://www.lam-mpi.org/svn/). Warning: This version can be unstable since
>it is still under production.
>
>-Vishal
>
>
>On Fri, 19 Mar 2004, Karl Forner wrote:
>
># Hello,
>#
># I've been using LAM in production on two clusters for years, and there's
># a very annoying bug that is still present even
># in the latest version.
>#
># When you kill a lam job, for example by typing 'CTRL+C' in the terminal,
># some files are left open by the lam daemon.
># Then the number of open files reaches 71, and at this point you can no
># longer launch new jobs; you get an error message like:
>#
># lamexec (set_stdio): Too many open files in system
>#
># It is easy to reproduce : for example on a linux cluster, with redhat
># 7.2 running lam 7.0.4.
>#
># % lamboot -b -v
>#
># then get the pid of the lam daemon, e.g.
># % PID=`pgrep lamd -u $USER`
>#
># then count the number of open files (plus one) :
># % ls -l /proc/$PID/fd | wc -l
># you should have 11 open files
>#
># then repeat the following process
>#
># launch a simple lam command
># % lamexec N sleep 10
># and interrupt it with one or two 'CTRL+C'
># you can check with " ls -l /proc/$PID/fd | wc -l" that the number of
># open files is increasing.
>#
># repeat it until you reach 71 open files, then you should have the error
># message.
>#
># Has this bug already been reported?
># Do you need some help to fix it ?
>#
># Thanks Karl FORNER
>#
>