LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-10-07 13:11:15


Arf. I found an uninitialized variable that *could* be the problem
here. Could you apply the following patch and tell me if it fixes your
problem?

Index: share/ssi/boot/rsh/src/ssi_boot_rsh.c
===================================================================
--- share/ssi/boot/rsh/src/ssi_boot_rsh.c (revision 9924)
+++ share/ssi/boot/rsh/src/ssi_boot_rsh.c (working copy)
@@ -504,7 +504,7 @@
  #endif
    char *session_suffix = NULL;
    ELEM search;
- ELEM *prefix_keyval;
+ ELEM *prefix_keyval = NULL;
    char prefix_key[] = "prefix";

    local_lamprefix = lamprefix;
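
For readers wondering why a one-line initializer could matter here: the sketch below is a hedged illustration only, not the actual code in ssi_boot_rsh.c. The `elem_t` type and the `find_elem` and `choose_prefix` functions are hypothetical stand-ins, and the sketch assumes the lookup that assigns `prefix_keyval` can be skipped on some code path. If it can, a later NULL test reads an indeterminate pointer unless the variable was initialized, and dereferencing that garbage can show up as a segmentation fault or bus error.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a key/value record such as LAM's ELEM. */
typedef struct {
    const char *key;
    const char *value;
} elem_t;

/* Linear search over the table; returns NULL when the key is absent. */
static const elem_t *find_elem(const elem_t *table, size_t n, const char *key)
{
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(table[i].key, key) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Returns the per-node prefix when one is set, otherwise the default. */
const char *choose_prefix(const elem_t *table, size_t n,
                          const char *default_prefix)
{
    /* Initialized to NULL, mirroring the patch. */
    const elem_t *prefix_keyval = NULL;

    /* Assumed scenario: the lookup is skipped when there is no table. */
    if (table != NULL) {
        prefix_keyval = find_elem(table, n, "prefix");
    }

    /* Without the initializer, this test would read an indeterminate
     * pointer whenever the lookup above was skipped. */
    if (prefix_keyval != NULL) {
        return prefix_keyval->value;
    }
    return default_prefix;
}
```

With the initializer in place, a call such as `choose_prefix(NULL, 0, "/usr")` reliably falls back to the default instead of depending on whatever happened to be on the stack.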

On Oct 6, 2004, at 2:07 PM, Ricardo Pereira wrote:

> Hi there,
>
> I'm using lam-7.1.1 and when I try to run lamgrow it gives me a
> segmentation fault.
>
> Then, if I run lamgrow with the -d option it says:
>
> n0<621> ssi:boot:open: opening
> n0<621> ssi:boot:open: opening boot module globus
> n0<621> ssi:boot:open: opened boot module globus
> n0<621> ssi:boot:open: opening boot module rsh
> n0<621> ssi:boot:open: opened boot module rsh
> n0<621> ssi:boot:open: opening boot module slurm
> n0<621> ssi:boot:open: opened boot module slurm
> n0<621> ssi:boot:select: initializing boot module globus
> n0<621> ssi:boot:globus: globus-job-run not found, globus boot will
> not run
> n0<621> ssi:boot:select: boot module not available: globus
> n0<621> ssi:boot:select: initializing boot module rsh
> n0<621> ssi:boot:rsh: module initializing
> n0<621> ssi:boot:rsh:agent: ssh
> n0<621> ssi:boot:rsh:username: <same>
> n0<621> ssi:boot:rsh:verbose: 1000
> n0<621> ssi:boot:rsh:algorithm: linear
> n0<621> ssi:boot:rsh:no_n: 0
> n0<621> ssi:boot:rsh:no_profile: 0
> n0<621> ssi:boot:rsh:fast: 0
> n0<621> ssi:boot:rsh:ignore_stderr: 0
> n0<621> ssi:boot:rsh:priority: 10
> n0<621> ssi:boot:select: boot module available: rsh, priority: 10
> n0<621> ssi:boot:select: initializing boot module slurm
> n0<621> ssi:boot:slurm: not running under SLURM
> n0<621> ssi:boot:select: boot module not available: slurm
> n0<621> ssi:boot:select: finalizing boot module globus
> n0<621> ssi:boot:globus: finalizing
> n0<621> ssi:boot:select: closing boot module globus
> n0<621> ssi:boot:select: finalizing boot module slurm
> n0<621> ssi:boot:slurm: finalizing
> n0<621> ssi:boot:select: closing boot module slurm
> n0<621> ssi:boot:select: selected boot module rsh
> n0<621> ssi:boot: found boot hostname: yyy.yyy.yyy.yyy
> n0<621> ssi:boot: adding node n1
> n0<621> ssi:boot: found existing n0: xxx.xxx.xxx.xxx, cpu=1
> n0<621> ssi:boot: creating empty node n1
> n0<621> ssi:boot: filled n1 data
> n0<621> ssi:boot:rsh: found the following hosts:
> n0<621> ssi:boot:rsh: n0 xxx.xxx.xxx.xxx (cpu=1)
> n0<621> ssi:boot:rsh: n1 yyy.yyy.yyy.yyy (cpu=1)
> n0<621> ssi:boot:rsh: resolved hosts:
> n0<621> ssi:boot:rsh: n0 xxx.xxx.xxx.xxx --> xxx.xxx.xxx.xxx (origin)
> n0<621> ssi:boot:rsh: n1 yyy.yyy.yyy.yyy --> yyy.yyy.yyy.yyy
> n0<621> ssi:boot:rsh: starting RTE procs
> n0<621> ssi:boot:base:linear: starting
> n0<621> ssi:boot:base:server: opening server TCP socket
> n0<621> ssi:boot:base:server: opened port 49763
> n0<621> ssi:boot:base:linear: skipping n0 (xxx.xxx.xxx.xxx); not
> bootable
> n0<621> ssi:boot:base:linear: booting n1 (yyy.yyy.yyy.yyy)
> Bus error
>
> where xxx.xxx.xxx.xxx is the IP of the machine already in the LAM
> universe and yyy.yyy.yyy.yyy is the IP of the machine I want to add.
>
> If I lamboot these two machines together, they work fine.
>
> Can anybody help me?
>
> Ricardo
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/