
LAM/MPI General User's Mailing List Archives


From: Pak, Anne O (anne.o.pak_at_[hidden])
Date: 2003-08-19 11:49:56


hi jeff:
i DID compile with -g :(

these are my compile statements

mpicc -d -g -o slave update_slave.c
mpicc -d -g -o master update_master.c
mpicc -d -g -o MEX_debug MEX_debug.c

and yes i do disconnect the intercommunicator created with
MPI_Comm_connect. any more ideas?

-----Original Message-----
From: Jeff Squyres [mailto:jsquyres_at_[hidden]]
Sent: Tuesday, August 19, 2003 9:28 AM
To: General LAM/MPI mailing list
Subject: Re: LAM: valgrind output: error translation needed!!!

On Tue, 19 Aug 2003, Pak, Anne O wrote:

> 1. What does it mean to compile using the --with-purify option? is this
> an option i invoke when i call mpicc or do i need to activate this
> option somewhere else?

You need to use the --with-purify option when you configure LAM itself,
i.e., when you run LAM's configure script. Without it, the results that
you get from valgrind are littered with all kinds of false positives.

> 2. In the context of MPI, can someone please decipher what the following
> 2 memory leak errors mean? These are outputs from using VALGRIND:
>
> ==27721== 64 bytes in 4 blocks are definitely lost in loss record 3 of 4
> ==27721== at 0x40026488: malloc (vg_replace_malloc.c:153)
> ==27721== by 0x804A204: connect_to_port (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x8049D67: MPI_Comm_connect (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x8049704: main (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x420158F6: __libc_start_main (in /lib/i686/libc-2.3.2.so)
> ==27721== by 0x80494E0: (within /home/apak/MATMEXMPI/MEX_debug)

You probably want to compile all your source files with -g because
valgrind can then show you (in this report) exactly which line numbers
the errors occur on.

This is saying that there is a memory leak in your MEX_debug program
when you call MPI_Comm_connect. More specifically, the innards of
MPI_Comm_connect are calling malloc to allocate some memory, and that
memory is never free()'ed. Are you calling MPI_Comm_disconnect?
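
To make the connect/disconnect pairing concrete, here is a toy model in
plain C. It is only a sketch and is not LAM's code: toy_comm,
toy_connect, toy_disconnect, and live_blocks are hypothetical names
invented for illustration. It assumes a library whose connect call
mallocs internal state that only the matching disconnect call frees,
which is the situation the valgrind report suggests.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a communicator handle; NOT LAM's real data structure. */
typedef struct {
    char *port_state;   /* internal memory allocated while "connecting" */
} toy_comm;

/* Blocks allocated by toy_connect() that have not been released yet. */
int live_blocks = 0;

/* Plays the role of a connect call: allocates state the caller cannot
 * free directly. Returns NULL if either allocation fails. */
toy_comm *toy_connect(void)
{
    toy_comm *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;

    c->port_state = malloc(16);
    if (c->port_state == NULL) {
        free(c);
        return NULL;
    }

    live_blocks += 2;   /* the handle plus its internal buffer */
    return c;
}

/* Plays the role of a disconnect call: the only place the internal
 * state is released. Calling it on an already-released handle is a no-op. */
void toy_disconnect(toy_comm **c)
{
    if (c == NULL || *c == NULL)
        return;

    assert(live_blocks >= 2);
    free((*c)->port_state);
    free(*c);
    *c = NULL;          /* prevent accidental reuse of a freed handle */
    live_blocks -= 2;
}
```

If a program calls toy_connect() and exits without toy_disconnect(),
both blocks are still allocated and nothing points to them any more, so
a memory checker would report them as lost. In a real MPI program the
analogous pair is MPI_Comm_connect and MPI_Comm_disconnect; this sketch
says nothing about what LAM actually frees on disconnect.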

> ==27721== 68 bytes in 1 blocks are definitely lost in loss record 4 of 4
> ==27721== at 0x40026488: malloc (vg_replace_malloc.c:153)
> ==27721== by 0x412E2B9A: ???
> ==27721== by 0x420FF1B8: __GI_innetgr (in /lib/i686/libc-2.3.2.so)
> ==27721== by 0x412139FD: ???
> ==27721== by 0x412115B0: ???
> ==27721== by 0x420AF693: getpwuid_r@@GLIBC_2.1.2 (in /lib/i686/libc-2.3.2.so)
> ==27721== by 0x420AED0E: getpwuid (in /lib/i686/libc-2.3.2.so)
> ==27721== by 0x80696AC: killname (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x8069831: sockname (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x80645AD: _cio_init (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x806A31E: _cipc_init (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x8064CA1: kinit (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x8064A2D: kenter (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x804E371: lam_linit (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x804AFB9: MPI_Init (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x80495E5: main (in /home/apak/MATMEXMPI/MEX_debug)
> ==27721== by 0x420158F6: __libc_start_main (in /lib/i686/libc-2.3.2.so)
> ==27721== by 0x80494E0: (within /home/apak/MATMEXMPI/MEX_debug)

Similarly, there are things that are allocated that are never free'd.
However, this looks like memory that was allocated deep within MPI_Init
(specifically, within getpwuid_r, which is a library call), and you
probably can't do anything about that. So it is safe to ignore.

Memory-checking debuggers like Valgrind can show you all kinds of things
about your application that you never knew -- including the fact that
system-level libraries (like libc!) leak memory. :-)

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/