LAM/MPI General User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2004-09-17 17:12:46


This is quite odd -- I can't think of a reason why this would happen.
Do you get corefiles from your MPI processes, perchance?

Some other suggestions:

- can you lamexec multiple non-MPI processes on a single node? e.g.,
"lamexec n0 n0 uptime"
- can you mpirun a simple hello world MPI process on a single node?
e.g., "mpirun n0 n0 hello"
- if you have ssh running in the cluster, can you try using the rsh/ssh
boot module instead of bproc? I *doubt* that bproc is the issue, but
you never know -- i.e., if you use a different module and the same
results happen, then it's *probably* not the boot module that's at
fault.
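
For reference, a minimal sketch of the kind of "hello" test program meant
above. The file name hello.c and the output format are only illustrative
assumptions; the MPI calls themselves are the standard ones.

```c
/* hello.c -- minimal MPI test program (illustrative sketch) */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    /* Every process started by mpirun must call MPI_Init;
       otherwise mpirun reports the error quoted below. */
    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```

Compiled with "mpicc hello.c -o hello" and started with "mpirun n0 n0 hello",
this should print one line per process if two processes can start on n0.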

On Sep 15, 2004, at 12:56 PM, Kaveh Moallemi - CSCI/P2003 wrote:

> Hello,
>
> I've installed lam-7.0.6 integrated with bproc-3.2.6 on a small 4-node
> cluster. I can execute MPI programs on the cluster and they run just
> fine. However, if I try to run more than one process per node (since
> the nodes are dual processors), I get the following error message:
>
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
>
> I should note that 3 of the 4 nodes in the cluster are diskless and
> thus have a minimal root file system (an 8 MB ramdisk). I suspect
> that I'm doing something wrong in my setup. Does anyone have any
> ideas? Why can I run a single MPI process on each node but not 2?
>
> Any help would be greatly appreciated.
>
>
> Kaveh
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>

-- 
{+} Jeff Squyres
{+} jsquyres_at_[hidden]
{+} http://www.lam-mpi.org/