LAM/MPI General User's Mailing List Archives

From: Brian W. Barrett (brbarret_at_[hidden])
Date: 2002-09-11 15:29:20


Just a follow-up for the list archives. Jeremy currently thinks that
this is a hardware / operating system problem: something painful-sounding
like the NIC processors overheating :(.

Brian

On Thu, 5 Sep 2002, jeremy archuleta wrote:

> i believe that lam6.6b1 has problems in running a lot
> (as in 100's) of successive jobs. at least, running my
> code ; )
>
> it appears that lam6.6b1 is not releasing memory
> correctly, because when lam refuses to run anymore,
> memory usage, given by "top", is around 95% and my
> code only uses 16MB spread out over 4 nodes for each
> run with 2 "malloc's" and 2 "free's".
>
> but i have also found that lam6.5.6 can run without
> problem for 1140+ runs (i killed it).
>
> one last thing, when it stops i can then "lamhalt",
> but i can't "lamboot" because lamboot can't boot the
> origin node, but it can boot the remote nodes.
>
> i am curious to know if anyone else has had these
> types of problems. if you haven't, but would like to
> test my code, i can send it to you...it's a small heat
> equation code.
>
> that's it.
> oh. and this version passed all the lamtests when i
> installed it.
> -j
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
  Brian Barrett
  LAM/MPI developer and all around nice guy
  Have a LAM/MPI day: http://www.lam-mpi.org/