On Thu, 13 Jan 2005, Jess Andreas Michelsen wrote:
> forrtl: severe (41): insufficient virtual memory
>
> originating from one or more of the nodes. The crash always happens at
> the same place in the program, but the nodes on which it happens differ
> between runs.
I have also seen some strange behaviour with Fortran codes, which I
attribute to the exec-shield feature of recent Red Hat kernels and
possibly to other (NPTL?) related changes. It also happens with a
non-parallel binary (so LAM has nothing to do with it). I got rid of
it by disabling exec-shield (there's a /proc entry that can be
changed) and increasing the default stack size. I've done this
mainly on FC1/RHEL3 on lots of machines, but also tried a few runs on
an FC3 machine.
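To make the two settings concrete, here is a minimal Python sketch that
reports the exec-shield state and raises the soft stack limit to the
hard limit. The path /proc/sys/kernel/exec-shield is an assumption
(above I only say "a /proc entry"), and the helper names are
hypothetical; treat it as an illustration, not a tested recipe for any
particular kernel.

```python
import resource

# Assumed location of the exec-shield switch on the Red Hat kernels
# discussed here; on kernels without the feature the file is absent.
EXEC_SHIELD = "/proc/sys/kernel/exec-shield"


def read_exec_shield(path=EXEC_SHIELD):
    """Return the exec-shield setting as a string, or None if unavailable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def fmt_limit(value):
    """Format an rlimit value given in bytes for display."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} kB"


def raise_stack_limit():
    """Raise the soft stack limit to the hard limit; return the new pair."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    state = read_exec_shield()
    print("exec-shield:", state if state is not None else "not present")
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("stack before:", fmt_limit(soft), "/", fmt_limit(hard))
    soft, hard = raise_stack_limit()
    print("stack after: ", fmt_limit(soft), "/", fmt_limit(hard))
```

The new limit applies only to this process and its children, so the
Fortran binary would have to be started from it; in a shell the usual
equivalent is `ulimit -s`. Changing the exec-shield entry itself
requires root.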
> A mere 'top' shows the memory usage being stable, i.e. not growing over
> time, and appr. 1/10 of the physical memory on each node is reported as
> utilized.
This would indeed support my thoughts above: there is enough memory,
but the application doesn't get it the way it expects.
--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]