
LAM/MPI General User's Mailing List Archives


From: Sebastian Henkel (henkel_at_[hidden])
Date: 2003-10-16 07:38:07


I have installed LAM/MPI 7.0.2 on an HPUX 10.20 workstation. The compilation
went successfully, and I could run the examples and several other test cases
without problems on several nodes. As requested on the webpage at "lam dash
mpi dot org", I have attached the config.log file and the laminfo output to
this mail.

The problem I am having is with a CFD program developed with LAM/MPI on
RH-Linux. According to the developers it runs without problems there. When
started on two nodes on the same machine, it works without any problems. If
I start the case on two nodes, each on a different workstation, the program
doesn't get past MPI_BARRIER. I suspect it could be a problem with HPUX
10.20, but I don't know.

I need help, as I am not familiar with MPI in any way beyond compiling it.

I run MPI with the following command (tcp is the default):

mpirun n0 n1 program

The error I receive is:

MPI_Recv: process in local group is dead (rank 0, MPI_COMM_WORLD)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Recv()
Rank (0, MPI_COMM_WORLD): - MPI_Barrier()
Rank (0, MPI_COMM_WORLD): - main()

Having added write statements to the program, I know it always crashes when
calling MPI_BARRIER.
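To rule out the CFD code itself, a minimal barrier-only reproducer might help; if the sketch below also hangs across two workstations, the problem is in the LAM/TCP layer rather than in the application. This is an assumption on my part, not part of the original report; the program name and compile line are hypothetical.

```fortran
! barrier_test.f90 -- minimal sketch: does MPI_BARRIER work across hosts?
! Compile with the same wrapper as the CFD code, e.g.:
!   mpif77 -o barrier_test barrier_test.f90
! Run on two hosts, e.g.:  mpirun n0 n1 barrier_test
program barrier_test
   implicit none
   include 'mpif.h'
   integer :: ierr, rank, nprocs

   call MPI_INIT(ierr)
   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
   call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

   write (*,*) 'rank ', rank, ' of ', nprocs, ': before barrier'
   call MPI_BARRIER(MPI_COMM_WORLD, ierr)
   write (*,*) 'rank ', rank, ': after barrier'

   call MPI_FINALIZE(ierr)
end program barrier_test
```

If each rank prints the "before" line but never the "after" line, the hang is reproducible independently of the CFD program.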

The CFD program and its MPI calls are written in Fortran 90 and built with
the mpif77 wrapper. When compiling LAM I made sure that mpif77 uses f90 as
the Fortran compiler.

I hope someone can give me a hint as to what the cause might be.

Best regards

Sebastian Henkel

--
Dipl.-Ing. Sebastian Henkel, Naval Architect, TKB Basic Design
Tel.  : +49 461 4940-508    FLENSBURGER SCHIFFBAU-GESELLSCHAFT mbH & Co. KG
Fax   : +49 461 4940-217    Batteriestrasse 52,  D-24939 Flensburg, Germany
E-Mail: henkel at fsg-ship dot de