LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2006-04-29 13:36:51


Hm. This is quite fishy.
 
Can you please send the output requested by
http://www.lam-mpi.org/using/support/ for the build when you specified
--without-memory-manager? (PLEASE COMPRESS)
 
Also check your path and ensure that you are running the lamboot from
your most recent install (e.g., try "which lamboot").
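
The path check above can be scripted. This is a minimal sketch, assuming a POSIX shell (csh/tcsh users would save it as a script run with sh); `check_install` is a hypothetical helper name, not part of LAM:

```shell
#!/bin/sh
# Hypothetical helper: report whether the first copy of a command found on
# PATH lives under the install prefix you expect.
check_install() {
    cmd="$1"       # command to look up, e.g. lamboot
    prefix="$2"    # expected install prefix (no trailing slash)
    resolved=$(command -v "$cmd") || {
        echo "$cmd: not found on PATH"
        return 2
    }
    case "$resolved" in
        "$prefix"/*) echo "ok: $resolved"; return 0 ;;
        *)           echo "mismatch: $resolved is not under $prefix"; return 1 ;;
    esac
}
```

For example, `check_install lamboot /people/jgche/lam-7.1.2-debug` (the install path that appears in the gdb output quoted below) reports "ok" only if the first lamboot on the PATH comes from that install.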

________________________________

        From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf Of J G Che
        Sent: Friday, April 28, 2006 9:21 PM
        To: General LAM/MPI mailing list
        Subject: Re: LAM: can gcc 3.2 and kernel 2.4.20 suit lam-7.1.2 or not? Or other problem for lam-7.1.2?
        
        
        Thanks!
        I compiled with the memory manager option (see config.log) and typed bt
        after gdb stopped; the output is below:
         

        jgche: ~/lam-test>lamboot -v lamhosts
        Segmentation fault
        jgche: ~/lam-test>gdb lamboot
        GNU gdb Red Hat Linux (5.2.1-4)
        Copyright 2002 Free Software Foundation, Inc.
        GDB is free software, covered by the GNU General Public License, and you are
        welcome to change it and/or distribute copies of it under certain conditions.
        Type "show copying" to see the conditions.
        There is absolutely no warranty for GDB. Type "show warranty" for details.
        This GDB was configured as "i386-redhat-linux"...
        (gdb) bt
        No stack.
        (gdb) run
        Starting program: /people/jgche/lam-7.1.2-debug/bin/lamboot
        [New Thread 8192 (LWP 28340)]

        Program received signal SIGTRAP, Trace/breakpoint trap.
        [Switching to Thread 8192 (LWP 28340)]
        0x00000000 in ?? ()
        (gdb) bt
        #0 0x00000000 in ?? ()
        (gdb) quit
        The program is running. Exit anyway? (y or n) y
        jgche: ~/lam-test>laminfo
                     LAM/MPI: 7.1.2
        Segmentation fault
        jgche: ~/lam-test>
         
        No output was produced in the directory.
         
        Myrinet is installed on the cluster; however, I did not use the gm
        option when compiling LAM. I have also tried compiling it with the gm
        switches, and the problem is the same.
         
        JG

                ----- Original Message -----
                From: Jeff Squyres (jsquyres) <mailto:jsquyres_at_[hidden]>
                To: General LAM/MPI mailing list <mailto:lam_at_[hidden]>
                Sent: Friday, April 28, 2006 8:17 PM
                Subject: Re: LAM: can gcc 3.2 and kernel 2.4.20 suit lam-7.1.2 or not? Or other problem for lam-7.1.2?

                Where gdb stops and gives you the "(gdb)" prompt, type "bt" and hit
                enter. This will give us a backtrace and show us exactly where it
                stopped.

                Can you send the output of laminfo? If laminfo fails to run, can you
                configure LAM with the following configure switch:
                --with-memory-manager=none. This *feels* like a memory manager
                problem, but the environment you listed should not be a problem
                (gcc 3.2, kernel 2.4.20). Are you using a high speed network such as
                Myrinet or Infiniband?

________________________________

                        From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf Of J G Che
                        Sent: Friday, April 28, 2006 1:08 AM
                        To: General LAM/MPI mailing list
                        Subject: Re: LAM: can gcc 3.2 and kernel 2.4.20 suit lam-7.1.2 or not? Or other problem for lam-7.1.2?
                        
                        
                        Thanks! I tried:
                         

                        jgche: ~/lam-test>lamboot
                        Segmentation fault
                        jgche: ~/lam-test>gdb lamboot
                        GNU gdb Red Hat Linux (5.2.1-4)
                        Copyright 2002 Free Software Foundation, Inc.
                        GDB is free software, covered by the GNU General Public License, and you are
                        welcome to change it and/or distribute copies of it under certain conditions.
                        Type "show copying" to see the conditions.
                        There is absolutely no warranty for GDB. Type "show warranty" for details.
                        This GDB was configured as "i386-redhat-linux"...
                        (gdb) run
                        Starting program: /people/jgche/lam-7.1.2-debug/bin/lamboot
                        [New Thread 8192 (LWP 24731)]

                        Program received signal SIGTRAP, Trace/breakpoint trap.
                        [Switching to Thread 8192 (LWP 24731)]
                        0x00000000 in ?? ()
                        (gdb)

                         

                        I don't know how to proceed. Could you please give
                        me more detail?

                         

                        I have also tried installing lam-7.0.4; it fails
                        the same way, with "Segmentation fault".

                         

                        JG

                                ----- Original Message -----
                                From: Jeff Squyres (jsquyres) <mailto:jsquyres_at_[hidden]>
                                To: General LAM/MPI mailing list <mailto:lam_at_[hidden]>
                                Sent: Thursday, April 27, 2006 7:17 PM
                                Subject: Re: LAM: can gcc 3.2 and kernel 2.4.20 suit lam-7.1.2 or not? Or other problem for lam-7.1.2?

                                This is certainly quite odd and should not happen.

                                Can you try running "lamboot -d lamhosts" with 7.1.2? That might
                                give a bit more output.

                                If that doesn't reveal anything useful, could you recompile LAM
                                with debugging symbols enabled (e.g., "./configure CFLAGS=-g ...."),
                                ensure that your coredumpsize is unlimited, and run it again? This
                                should then generate a corefile -- if you could send the backtrace
                                from that, it would be most useful.
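
The core-file step can be sketched as follows, assuming gdb and mktemp are available and the build was configured with debugging symbols. `core_backtrace` is a hypothetical helper name, and the name of the core file varies by system. The `bt` command is passed through a command file using gdb's long-standing `-batch` and `-x` options:

```shell
#!/bin/sh
# Sketch: print a backtrace from a core file without an interactive session.
# Core dumps must be enabled in the shell *before* the crash is reproduced:
#   sh/bash:   ulimit -c unlimited
#   csh/tcsh:  limit coredumpsize unlimited
core_backtrace() {
    prog="$1"    # path to the crashing binary, e.g. <prefix>/bin/lamboot
    core="$2"    # core file written by the crash (name is system-dependent)
    if [ ! -f "$core" ]; then
        echo "no core file at $core" >&2
        return 1
    fi
    cmds=$(mktemp) || return 1
    echo "bt" > "$cmds"
    # -batch exits after running the commands; -x reads them from a file.
    gdb -batch -x "$cmds" "$prog" "$core"
    rc=$?
    rm -f "$cmds"
    return $rc
}
```

If the function reports that no core file exists, the coredumpsize limit was probably still in effect when the program crashed, or the system writes core files elsewhere.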
                                 
                                Thanks!

________________________________

                                From: lam-bounces_at_[hidden] [mailto:lam-bounces_at_[hidden]] On Behalf Of J G Che
                                Sent: Thursday, April 27, 2006 1:45 AM
                                To: General LAM/MPI mailing list
                                Subject: LAM: can gcc 3.2 and kernel 2.4.20 suit lam-7.1.2 or not? Or other problem for lam-7.1.2?
                                
                                
                                
                                I cannot install lam-7.1.2 on our cluster, which has
                                dual Xeons and Myrinet. Its gcc version is:
                                 
                                jgche: ~>gcc -v
                                Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2/specs
                                Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --host=i386-redhat-linux --with-system-zlib --enable-__cxa_atexit
                                Thread model: posix
                                gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
                                 
                                Its kernel seems to be 2.4.20-28.8smp. (I'm not the
                                administrator, and the administrator will not install LAM/MPI,
                                so I want to install it for myself.)
                                 
                                I compiled lam-7.1.2 without problems;
                                please see the attached config.7.1.2.log and make.7.1.2.log. However,
                                when I ran lamboot, I got:
                                 
                                jgche: ~>cat lamhosts
                                admin1
                                jgche: ~>lamboot -v lamhosts
                                Segmentation fault
                                jgche: ~>
                                 
                                Except for mpif77, mpicc, and mpic++, every other
                                executable I ran in /people/jgche/lam-7.1.2-eth/bin gave
                                "Segmentation fault"! I could not fix the problem, so I tried
                                installing lam-6.5.7, since I thought this version was released in
                                Oct 2002, at almost the same time as gcc 3.2. That version seems
                                to work:
                                 
                                jgche: ~>rm lam-eth
                                jgche: ~>ln -s lam-6.5.7-eth/ lam-eth
                                jgche: ~>lamboot -v lamhosts

                                LAM 6.5.7/MPI 2 C++/ROMIO - Indiana University

                                Executing hboot on n0 (admin1 - 1 CPU)...
                                topology done
                                 
                                Please also refer to the attached
                                config.6.5.7.log and make.6.5.7.log.
                                 
                                What is causing this problem? Is it the gcc version,
                                the kernel, or something else? How can I fix it?
                                 
                                Thanks!
                                 
                                JG
                                 

                                
_______________________________________________
                                This list is archived at
http://www.lam-mpi.org/MailArchives/lam/

                