Hi,
Thanks. This LAM installation is for some legacy users; we have all flavors
of MPI on the cluster:
Open MPI, MVAPICH1 and 2, Intel MPI, and MPICH1 and 2, all compiled with the
gcc, icc, and pgi compilers.
All the other MPI implementations have been tested and work fine; only LAM fails.
To optimize, we have stopped most of the services on the
compute nodes, several of which are still running on the master node.
Is there any particular service that LAM needs to run?
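
The report quoted below says mpiexec returns only after "enter" is pressed, which may mean something is waiting on standard input. A minimal sketch for checking that non-interactively, assuming a POSIX shell; run_detached is a hypothetical helper name, not a LAM/MPI command:

```shell
# Run a command with stdin detached from the terminal and report its
# exit status. run_detached is a hypothetical helper, not part of LAM/MPI.
run_detached() {
    # Redirect stdin from /dev/null so the command cannot wait for a keypress.
    "$@" < /dev/null
    rd_status=$?
    echo "exit status: $rd_status"
    return "$rd_status"
}

# Intended use on a compute node (assumes lamboot has already succeeded):
#   run_detached mpiexec -np 4 ./cpi
```

If the run still hangs with stdin detached, the keypress is probably not the cause in itself, and the printed status can be compared between the master node and the failing node.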
regards
Prithu
On Thu, Mar 19, 2009 at 5:10 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> If you're just starting with MPI, I would recommend that you start with
> Open MPI instead of LAM/MPI.
>
> LAM/MPI is in maintenance mode; it has no further development occurring.
>
> Open MPI is where all development is occurring these days. Indeed, Open
> MPI v1.3.1 is just about to be released.
>
> As for the specific problem you're seeing, I don't know why it would be
> happening. Are all of your machines identical in operating system and
> configuration? You might want to re-try the experiment with Open MPI and
> see what happens.
>
>
>
> On Mar 19, 2009, at 2:13 AM, Prithu Tiwari wrote:
>
>
>>
>> ---------- Forwarded message ----------
>> From: Prithu Tiwari <prithubt_at_[hidden]>
>> Date: Wed, Mar 18, 2009 at 3:08 PM
>> Subject: Re: Welcome to the "lam" mailing list
>> To: lam-request_at_[hidden]
>>
>>
>> Hi,
>> We are facing a strange problem when running the example pi program
>> with the LAM mpiexec.
>>
>> We installed LAM by compiling the downloaded lam-7.1.4 source with
>> configure/make/make install
>> into a particular path (/opt/lam/gcc) using "--prefix".
>> The compilation was done on the master node; on the rest of the
>> cluster nodes we simply scp-ed the directory
>> (/opt/lam/gcc).
>>
>> We compiled the program cpi.c with "mpicc", after setting the proper
>> paths:
>> mpicc cpi.c -o cpi -lm
>> This completes without any problem.
>> We run this on the master node using
>> lamboot mac
>> where mac contains the hostnames of the master node and another node:
>> nx0
>> nx0
>> nx1
>> nx1
>> and execute it using the following:
>> mpiexec -np 4 ./cpi
>> This runs properly and exits without any error.
>>
>> We try the same thing from another node:
>> lamboot nod
>> where nod contains
>> nx31
>> nx31
>> nx32
>> nx32
>> and execute using:
>> mpiexec -np 4 ./cpi
>> This also runs properly but does not exit; it exits only if we press
>> "enter". If we check the exit status using
>> echo $?
>> it shows 25!
>> When submitted through Torque, it also reports an error in the pi.err file:
>> "mpirun exited with status 1"
>>
>> Can somebody tell us why this is happening?
>> regards
>> prithu
>>
>>
>> _______________________________________________
>> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>