LAM/MPI General User's Mailing List Archives

From: Ryuta Suzuki (suzu0037_at_[hidden])
Date: 2004-11-15 10:44:59


I was able to replicate the error:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 23737)]
0x4003c14a in _ksig_init () at ksignal.c:155
155 _kio.ki_sig_handlers[i] = sigign;
(gdb) where
#0 0x4003c14a in _ksig_init () at ksignal.c:155
#1 0x4003b8fc in kinit (priority=1095) at kinit.c:71
#2 0x080495be in main (argc=1, argv=0x804bf50) at lamexec.c:144
#3 0x420158f7 in __libc_start_main () from /lib/i686/libc.so.6

Would this be enough information?
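
For anyone reproducing this, the debug rebuild and variable inspection requested in the quoted message below might look roughly like the following. This is only a sketch under assumptions: the configure options and compiler flags are copied from the earlier message in this thread, the flags are assumed to be passed through CFLAGS (the thread does not show how the Intel compilers themselves were selected, so that part is left out), and the -batch and -ex options need a reasonably recent gdb. The names i, sigign, and _kio come from the source line shown in the trace above.

```shell
# Rebuild LAM with debugging symbols (run from the LAM source directory).
# Assumption: the original optimization flags were given via CFLAGS; -g is added.
CFLAGS="-g -O3 -xW -tpp7" ./configure --prefix="$HOME/Local/MPI/lam" \
    --enable-shared=yes \
    --disable-tv \
    --disable-tv-queue \
    --with-profiling \
    --with-threads=posix \
    --with-purify \
    --with-trillium \
    --with-rsh=ssh
make
make install

# Run lamexec under gdb non-interactively and print the state at the fault:
# the back trace, the local variables, the loop index, the value being stored,
# the address being written to, and the faulting instruction.
gdb -batch \
    -ex run \
    -ex where \
    -ex 'info locals' \
    -ex 'print i' \
    -ex 'print sigign' \
    -ex 'print &_kio.ki_sig_handlers[i]' \
    -ex 'x/i $pc' \
    "$HOME/Local/MPI/lam/bin/lamexec"
```

If the optimizer hides variable values, replacing -O3 with -O0 is a common next step, although that may also change whether the fault appears.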

Jeff Squyres wrote:

> Looking at this function (_ksig_init()), it's pretty short and I don't
> see how it could cause a seg fault. :-\
>
> Can you recompile LAM with debugging enabled (i.e., a CFLAGS
> containing -g) and try to replicate this error again? I'd like to
> know the exact line number that this seg fault is occurring on, and
> what the values of the supporting variables are.
>
>
> On Nov 14, 2004, at 9:03 PM, Ryuta Suzuki wrote:
>
>> Here's the back trace:
>>
>> (gdb) run
>> Starting program: /h/suzu0037/Local/MPI/lam/bin/lamexec
>> [New Thread 16384 (LWP 25873)]
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 16384 (LWP 25873)]
>> 0x4003c156 in _ksig_init () from
>> /h/suzu0037/Local/MPI/lam/lib/liblam.so.0
>> (gdb) where
>> #0 0x4003c156 in _ksig_init () from
>> /h/suzu0037/Local/MPI/lam/lib/liblam.so.0
>> #1 0x4003b908 in kinit () from
>> /h/suzu0037/Local/MPI/lam/lib/liblam.so.0
>> #2 0x080495ca in main ()
>> #3 0x420158f7 in __libc_start_main () from /lib/i686/libc.so.6
>>
>> I used the Intel compilers 8.1 (icc, icpc, ifort) with the compiler
>> flags "-O3 -xW -tpp7". The configuration options are
>>
>> ./configure --prefix=$HOME/Local/MPI/lam \
>> --enable-shared=yes \
>> --disable-tv \
>> --disable-tv-queue \
>> --with-profiling \
>> --with-threads=posix \
>> --with-purify \
>> --with-trillium \
>> --with-rsh=ssh
>>
>> I can boot LAM and use mpirun without any problem. Will this help?
>>
>> Ryuta
>>
>> Jeff Squyres wrote:
>>
>>> No, this is not a known problem.
>>>
>>> Can you be more specific? What version of LAM are you using? Can
>>> you load up the corefiles in a debugger and send a back trace of
>>> where the problem occurs?
>>>
>>>
>>> On Nov 12, 2004, at 12:11 PM, Ryuta Suzuki wrote:
>>>
>>>> Has anybody experienced a seg fault in lamexec and lamgrow? I'm using
>>>> the Intel compiler 8.1, and gcc 3.4 is also installed on the system.
>>>> I get a seg fault immediately after I invoke lamexec or lamgrow.
>>>> _______________________________________________
>>>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>>>
>>>
>>
>