LAM/MPI General User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2004-10-14 13:20:57


On Oct 14, 2004, at 9:37 AM, Qikai Li wrote:

> The same code runs perfectly under LAM 7.0.6 with a stable memory
> usage.
>
> Several guys in our group have experienced the same problem when I
> switched the LAM from 7.0.6 to 7.1.
>
> Maybe this is related to a possible bug in gcc, i.e., the memory is NOT
> properly freed in a 64-bit environment even though you have used, for
> example, matching pairs of malloc (or calloc) and free.
>
> Also, the problem seems to be specific to 64-bit systems.
>
> Or maybe it's the problem of LAM 7.1.

Thanks for the bug report. We only have access to one Opteron machine
and it doesn't have Myrinet, so I was wondering if you could run a
couple of tests for me to help localize the problem. First, could you
send me the output from the "laminfo" command? There are a number of
places that changed between 7.0 and 7.1, so I'm hoping we can localize
it to a particular component. Does the memory leak happen regardless of
the number of processes running?

Also, could you see if it happens with the following SSI options:

   -ssi rpi tcp (use tcp instead of gm)
   -ssi coll lam_basic (use the really simple collectives code)

You specify the SSI parameters on the mpirun command line, so something
like:

   mpirun -np 4 -ssi rpi tcp ./a.out
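
The requested runs can be lined up as a small checklist. The sketch below only prints the command lines rather than executing them, so it is safe to run anywhere. The function name `ssi_commands` is hypothetical, and the `NP` and `PROG` defaults (4 processes, `./a.out`) are placeholders taken from the example above; the baseline run without any `-ssi` option is an added assumption, included only for comparison.

```shell
#!/bin/sh
# Print (do not run) the diagnostic commands suggested in this message.
# NP and PROG are placeholders; override them to match the real job.
NP=${NP:-4}
PROG=${PROG:-./a.out}

ssi_commands() {
    # Report the installed LAM configuration and modules.
    echo "laminfo"
    # Baseline (assumption: not requested in the message, kept for comparison).
    echo "mpirun -np $NP $PROG"
    # Use tcp instead of gm.
    echo "mpirun -np $NP -ssi rpi tcp $PROG"
    # Use the simple collectives code.
    echo "mpirun -np $NP -ssi coll lam_basic $PROG"
}

ssi_commands
```

Watching memory usage during each run and noting which variants still leak should help narrow the problem to either the rpi or the coll component.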

I'm looking at the problem as well, but having some starting points
would really help.

Thanks!

Brian

-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have an LAM/MPI day: http://www.lam-mpi.org/