LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-06-14 08:05:52


I'm not aware of any changes in the 7.x series that would cause this kind
of problem. The big change with regard to threads was mentioned earlier
last week (http://www.lam-mpi.org/MailArchives/lam/msg08115.php), but I
don't know enough about NAMD or Charm++ to know why that would make a
difference.

Have you contacted the Charm++ maintainers?

On Thu, 10 Jun 2004, Henning Martinussen wrote:

> Hi LAM users
>
>
> I've encountered a problem when trying to make NAMD 2.5 use
> LAM-MPI 7.0.4. When using LAM-MPI 6.5.9 there is no problem.
>
> I'm using a Debian-based system running Linux 2.4.26 and
> pthreads from libc 2.3.2.
>
>
> The first hint of the problem comes after compiling charm++.
> When running on a single processor
>
> $(CHARMBASE)/mpi-linux/pgms/charm++/megatest/pgm
>
> with either charmrun or directly with mpirun, it segfaults.
> gdb reveals that the segfault is in some pthread code.
>
>
> -------------------------------------------------------------------------
> gdb output from running $(CHARMBASE)/mpi-linux/pgms/charm++/megatest/pgm
> -------------------------------------------------------------------------
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 16384 (LWP 4216)]
> 0x4012038d in __pthread_cleanup_upto () from /lib/libpthread.so.0
> (gdb) bt
> #0 0x4012038d in __pthread_cleanup_upto () from /lib/libpthread.so.0
> #1 0x401931f1 in siglongjmp () from /lib/libc.so.6
> #2 0x4019315c in siglongjmp () from /lib/libc.so.6
> #3 0x40120446 in longjmp () from /lib/libpthread.so.0
> #4 0x08131def in qt_block ()
> #5 0x080e4d3f in CthResume ()
> #6 0x0811d896 in CthResumeNormalThread ()
> #7 0x0811d40d in CmiHandleMessage ()
> #8 0x0811d56f in CsdScheduleForever ()
> #9 0x0811d4ec in CsdScheduler ()
> #10 0x0811c221 in ConverseRunPE ()
> #11 0x0811c4a1 in ConverseInit ()
> #12 0x080f155a in main ()
> #13 0x4017fdc6 in __libc_start_main () from /lib/libc.so.6
> -------------------------------------------------------------------------
>
>
> After building NAMD and running the simple "alanin" test that is
> supplied along with NAMD2.5, a segfault is encountered again, and
> gdb reveals that it arises from the same sequence of calls.
>
> Has anyone encountered this problem, or does anyone know a solution (or
> workaround) for it?
>
>
>
> Kind regards
>
> \Henning
>
> --
> Henning Martinussen | MESH-Technologies A/S
> Systems Developer | Lille Gråbrødrestræde 1
> www.meshtechnologies.com | DK-5000 Odense C, Denmark
> hma_at_[hidden] | mobile: +45 6169 0742
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/