I have installed LAM/MPI 7.0.2 on an HPUX 10.20 workstation. The compilation
was successful, and I could run the examples and several other test cases
without problems on several nodes. As written on the webpage at "lam dash
mpi dot org" I have attached to this mail the config.log file and the
laminfo output.
The problem I am having is with a CFD program developed with LAM/MPI on
RH-Linux. According to the developers it runs without problems there.
When started on two nodes on the same processor it works without any
problems. If I start the case on two nodes, each on a different workstation,
the program doesn't get past MPI_BARRIER. I suppose it could be a problem with
HPUX 10.20, but I don't know.
I need help, as I am not familiar with MPI in any way beyond compiling it.
I run MPI with the following command (tcp is the default):
mpirun n0 n1 program
The error I receive is:
MPI_Recv: process in local group is dead (rank 0, MPI_COMM_WORLD)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Recv()
Rank (0, MPI_COMM_WORLD): - MPI_Barrier()
Rank (0, MPI_COMM_WORLD): - main()
Having added write statements to the program, I know it always crashes
when calling MPI_BARRIER.
The CFD program and its MPI implementation are written in Fortran 90 and built
with the mpif77 wrapper. When compiling LAM I made sure that mpif77 uses f90
as the Fortran compiler.
Hopefully someone can give me a hint as to what the reason might be.
Best regards
Sebastian Henkel
--
Dipl.-Ing. Sebastian Henkel, Naval Architect, TKB Basic Design
Tel. : +49 461 4940-508 FLENSBURGER SCHIFFBAU-GESELLSCHAFT mbH & Co. KG
Fax : +49 461 4940-217 Batteriestrasse 52, D-24939 Flensburg, Germany
E-Mail: henkel at fsg-ship dot de