On Oct 14, 2004, at 9:37 AM, Qikai Li wrote:
> The same code runs perfectly under LAM 7.0.6 with a stable memory
> usage.
>
> Several people in our group have experienced the same problem since I
> switched LAM from 7.0.6 to 7.1.
>
> Maybe this is related to a possible bug in gcc, i.e., the memory is NOT
> properly freed in a 64-bit environment even though you have used, for
> example, matched pairs of malloc (or calloc) and free.
>
> Also, the problem seems to be specific to 64-bit.
>
> Or maybe it's the problem of LAM 7.1.
Thanks for the bug report. We only have access to one Opteron machine
and it doesn't have Myrinet, so I was wondering if you could run a
couple of tests for me to help localize the problem. First, could you
send me the output from the "laminfo" command? A number of places
changed between 7.0 and 7.1, so I'm hoping we can narrow it down to a
particular component. Does the memory leak happen regardless of the
number of processes running?
Also, could you see if it happens with each of the following SSI options:
-ssi rpi tcp (use tcp instead of gm)
-ssi coll lam_basic (use the really simple collectives code)
You specify the SSI parameters on the mpirun command line, so something
like:
mpirun -np 4 -ssi rpi tcp ./a.out
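To spell out the runs I have in mind, here is a rough sketch. It only prints the command lines rather than running them, so you can check them first. The ssi_cmd helper, the NP and PROG variables, and the idea of also trying both options together are just illustrative assumptions on my part; substitute your own process count and binary.

```shell
#!/bin/sh
# Sketch only: prints the mpirun command lines to try; it does not run them.
# NP and PROG are placeholders (assumptions); use your real values.
NP=4
PROG=./a.out

# Hypothetical helper: builds one mpirun command line from the given SSI options.
ssi_cmd() {
    echo "mpirun -np $NP $* $PROG"
}

ssi_cmd -ssi rpi tcp                       # tcp instead of gm
ssi_cmd -ssi coll lam_basic                # simple collectives only
ssi_cmd -ssi rpi tcp -ssi coll lam_basic   # both together (my assumption, not required)
```

If the leak disappears with one of these but not the others, that points at the component that option replaced.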
I'm looking at the problem as well, but having some starting points
would really help.
Thanks!
Brian
--
Brian Barrett
LAM/MPI developer and all around nice guy
Have an LAM/MPI day: http://www.lam-mpi.org/