If you're just starting with MPI, I would recommend that you start
with Open MPI instead of LAM/MPI.
LAM/MPI is in maintenance mode; no further development is occurring on it.
Open MPI is where all development is occurring these days. Indeed,
Open MPI v1.3.1 is just about to be released.
As for the specific problem you're seeing, I don't know why it would
be happening. Are all of your machines identical in operating system
and configuration? You might want to re-try the experiment with Open
MPI and see what happens.
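
A minimal sketch of one way to check whether the nodes match, assuming
passwordless ssh to each node. The hostnames and the /opt/lam/gcc path
are taken from the message quoted below; the function names and the
NODE_RSH variable are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: print OS details and the LAM install directory for each node,
# prefixed with the hostname, so differences between nodes stand out.
# NODE_RSH is a hypothetical override for the remote-shell command;
# ssh is assumed when it is unset.

node_info() {
    # $1 = hostname. Runs uname and lists the LAM prefix on that host.
    ${NODE_RSH:-ssh} "$1" 'uname -srm; ls -ld /opt/lam/gcc'
}

compare_nodes() {
    # Each argument is a hostname; every output line is tagged with it.
    for host in "$@"; do
        node_info "$host" 2>&1 | while IFS= read -r line; do
            printf '%s: %s\n' "$host" "$line"
        done
    done
}

# Example (not run automatically):
#   compare_nodes nx0 nx1 nx31 nx32
```

If the lines for nx31 and nx32 differ from those for nx0 and nx1 (kernel,
architecture, or a missing /opt/lam/gcc), that would be a first place to
look. This checks only a few basics and is not a complete configuration
comparison.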
On Mar 19, 2009, at 2:13 AM, Prithu Tiwari wrote:
>
>
> ---------- Forwarded message ----------
> From: Prithu Tiwari <prithubt_at_[hidden]>
> Date: Wed, Mar 18, 2009 at 3:08 PM
> Subject: Re: Welcome to the "lam" mailing list
> To: lam-request_at_[hidden]
>
>
> Hi,
> We are facing a strange problem when running the example pi program
> with LAM's mpiexec.
>
> We installed the lam by compiling from downloaded lam-7.1.4 source
> and doing configure/make/make install
> to a particular path (/opt/lam/gcc) using "--prefix".
> This compilation was done on the master node; on the rest of the
> cluster nodes we simply scp-ed the directory
> (/opt/lam/gcc).
>
> We compiled the program cpi.c after doing the proper path settings
> using "mpicc"
> mpicc cpi.c -o cpi -lm
> This occurs without any problem.
> We run this on master node using
> lamboot mac
> mac contains the hostnames of the master node and another node:
> nx0
> nx0
> nx1
> nx1
> and execute using the following:
> mpiexec -np 4 ./cpi
> This runs properly and exits without any error.
>
> We try the same thing from another node:
> lamboot nod
> nod contains
> nx31
> nx31
> nx32
> nx32
> and execute using:
> mpiexec -np 4 ./cpi
> This also runs properly but does not exit; it exits only when we
> press Enter. If we check the exit status using
> echo $?
> it shows 25!
> When submitting through Torque, the pi.err file also shows an error:
> "mpirun exited with status 1"
>
> Can somebody tell us why this is happening?
> regards
> prithu
>
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
--
Jeff Squyres
Cisco Systems