LAM/MPI General User's Mailing List Archives

From: Bryan O'Sullivan (bos_at_[hidden])
Date: 2005-02-04 02:26:19


Now that I have lamboot not crashing with slurm 0.3.10, I find that the
nightly SVN lamboot causes srun to hang in the following scenario:

        $ srun -n 2 -A
        $ lamboot
        $ exit
        $ exit
        srun: error: eng-24: task0: Killed
        srun: Terminating job

This leaves behind a lamd process, which is stuck here:

        #0 0x0000003c484be455 in __select_nocancel () from /lib64/tls/libc.so.6
        #1 0x000000000040cf52 in kio_req ()
            at ../../../../otb/sys/kernel/kernelio.c:331
        #2 0x000000000040e297 in run_kernel (argc=1, argv=0x7fbffff308)
            at ../../../../otb/sys/kernel/kouter.c:176
        #3 0x0000000000404a9a in main (argc=1, argv=0x7fbffff308)
            at ../../../../otb/sys/lamd/lamd_main.c:105

More details tomorrow.
