When gdb stops and shows the "(gdb)" prompt, type "bt" and press
Enter. That prints a backtrace showing exactly where the program
stopped.
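If typing at the prompt is awkward, the same two commands ("run", then "bt") can be fed to gdb from a command file. The following is only a minimal sketch in Python: the helper names write_gdb_script and gdb_backtrace_argv are made up for illustration, and it assumes gdb and lamboot are both on your PATH. It builds the command line but does not launch gdb itself.

```python
import os
import tempfile


def write_gdb_script(commands):
    """Write gdb commands, one per line, to a temporary file; return its path."""
    fd, path = tempfile.mkstemp(suffix=".gdb")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(commands) + "\n")
    return path


def gdb_backtrace_argv(program, script_path):
    """Build the argument list for a non-interactive gdb session."""
    # -batch   : exit once the command file has been processed
    # -x FILE  : read gdb commands from FILE
    return ["gdb", "-batch", "-x", script_path, program]


# "run" starts the program; "bt" prints the backtrace once it has stopped.
script = write_gdb_script(["run", "bt"])
argv = gdb_backtrace_argv("lamboot", script)
print(" ".join(argv))

# To actually run it (assumption: gdb and lamboot are installed):
#   import subprocess
#   subprocess.run(argv)
```

The printed command can be pasted into a shell as-is, and its output sent to the list.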
Can you send the output of laminfo? If laminfo fails to run, can you
reconfigure LAM with the following configure switch:
--with-memory-manager=none. This *feels* like a memory manager problem,
but the environment you listed should not be a problem (gcc 3.2, kernel
2.4.20). Are you using a high-speed network such as Myrinet or
InfiniBand?
________________________________
From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]]
On Behalf Of J G Che
Sent: Friday, April 28, 2006 1:08 AM
To: General LAM/MPI mailing list
Subject: Re: LAM: can gcc 3.2 and kernel 2.4.20 suit lam-7.1.2
or not? Or other problem for lam-7.1.2?
Thanks! I tried:
jgche: ~/lam-test>lamboot
Segmentation fault
jgche: ~/lam-test>gdb lamboot
GNU gdb Red Hat Linux (5.2.1-4)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) run
Starting program: /people/jgche/lam-7.1.2-debug/bin/lamboot
[New Thread 8192 (LWP 24731)]
Program received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 8192 (LWP 24731)]
0x00000000 in ?? ()
(gdb)
I don't know how to go on from here. Could you please give me more detail?
I have also tried installing lam-7.0.4; it fails in the same way,
with "Segmentation fault".
JG
----- Original Message -----
From: Jeff Squyres (jsquyres)
<mailto:jsquyres_at_[hidden]>
To: General LAM/MPI mailing list
<mailto:lam_at_[hidden]>
Sent: Thursday, April 27, 2006 7:17 PM
Subject: Re: LAM: can gcc 3.2 and kernel 2.4.20 suit
lam-7.1.2 or not? Or other problem for lam-7.1.2?
This is certainly quite odd and should not happen.
Can you try running "lamboot -d lamhosts" with 7.1.2?
That might give a bit more output.
If that doesn't reveal anything useful, could you
recompile LAM with debugging symbols enabled (e.g., "./configure
CFLAGS=-g ...."), ensure that your coredumpsize is unlimited, and run it
again? This should then generate a corefile -- if you could send the
backtrace from that, it would be most useful.
Thanks!
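For reference, "coredumpsize" is the csh/tcsh name for the core-file size limit ("limit coredumpsize unlimited"); in sh/bash the equivalent is "ulimit -c unlimited". The same limit can also be raised from a small wrapper before starting the failing program. The following is only a sketch in Python, assuming a Unix system; allow_core_dumps is a made-up helper name, and the name of the resulting core file ("core", "core.<pid>", ...) depends on the system.

```python
import resource


def allow_core_dumps():
    """Raise the soft core-file size limit to the hard limit for this process.

    Child processes started afterwards inherit the new limit.
    Returns the (soft, hard) pair now in effect.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


soft, hard = allow_core_dumps()
if soft == resource.RLIM_INFINITY:
    print("core file size: unlimited")
else:
    print("core file size limit:", soft, "bytes")

# Then run the failing command from this same process, for example
# (assumption: lamboot is on PATH and lamhosts is in the current directory):
#   import subprocess
#   subprocess.run(["lamboot", "-d", "lamhosts"])
```

If a core file is produced, loading it with "gdb lamboot <corefile>" and typing "bt" gives the backtrace asked for above. If the hard limit itself is 0, the sketch cannot raise it and an administrator would have to.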
________________________________
From: lam-bounces_at_[hidden]
[mailto:lam-bounces_at_[hidden]] On Behalf Of J G Che
Sent: Thursday, April 27, 2006 1:45 AM
To: General LAM/MPI mailing list
Subject: LAM: can gcc 3.2 and kernel 2.4.20 suit
lam-7.1.2 or not? Or other problem for lam-7.1.2?
I cannot install lam-7.1.2 on our cluster (dual Xeon nodes
with Myrinet). The gcc version is:
jgche: ~>gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --host=i386-redhat-linux --with-system-zlib --enable-__cxa_atexit
Thread model: posix
gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
The kernel seems to be 2.4.20-28.8smp. (I am not the administrator,
and the administrator will not install LAM/MPI, so I want to install it
for myself.)
I compiled lam-7.1.2 without problems; please see
the attached config.7.1.2.log and make.7.1.2.log. However, when I ran
lamboot, I got:
jgche: ~>cat lamhosts
admin1
jgche: ~>lamboot -v lamhosts
Segmentation fault
jgche: ~>
Except for mpif77, mpicc, and mpic++, every executable I ran from
/people/jgche/lam-7.1.2-eth/bin gave "Segmentation fault"! I could not
fix the problem, so I tried installing lam-6.5.7, since I thought this
version was released in Oct 2002, at almost the same time as gcc 3.2.
That version seems to work:
jgche: ~>rm lam-eth
jgche: ~>ln -s lam-6.5.7-eth/ lam-eth
jgche: ~>lamboot -v lamhosts
LAM 6.5.7/MPI 2 C++/ROMIO - Indiana University
Executing hboot on n0 (admin1 - 1 CPU)...
topology done
Please also refer to the attached
config.6.5.7.log and make.6.5.7.log.
What is causing this problem? Is it the gcc version,
the kernel, or something else? How can I fix it?
Thanks!
JG
________________________________
_______________________________________________
This list is archived at
http://www.lam-mpi.org/MailArchives/lam/