LAM/MPI General User's Mailing List Archives

From: Llfrg_at_[hidden]
Date: 2004-01-21 14:53:36


Hello,

I am trying to use HPL with LAM. I compiled HPL successfully, but I am
having run-time problems.

Whenever I run HPL with a moderately large problem (say, N = 1000), one
process exits abnormally, as shown by the message below. I tried different
values for P and Q, but the problem persists. Small problems (N = 100)
run fine.

Sample error message:

MPI_Wait: process in local group is dead (rank 0, comm 4)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Wait()
Rank (0, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 18713 failed on node n1 due to signal 4.
-----------------------------------------------------------------------------
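
For reference, the signal number in the last line of the message can be mapped to its symbolic name. The sketch below uses Python's standard `signal` module; `describe_signal` is a hypothetical helper name, and signal numbering is platform-specific, so it should be run on the same kind of system as the failing node. On Linux, signal 4 is normally SIGILL (illegal instruction). Whether that points to a binary or library built for a different CPU than the node it ran on is only an assumption here, not something the message itself confirms.

```python
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name of a signal number on this platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        # The number is not a named signal on this platform.
        return f"unknown signal {signum}"


if __name__ == "__main__":
    # 4 is the signal number reported by mpirun for PID 18713.
    print(describe_signal(4))
```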

My hardware configuration is a Pentium 4 (512 MB of memory) as the main
node and 64 Pentium 3 machines with 256 MB of memory each, connected by
100 Mbps Fast Ethernet.

Below is an example HPL.dat file that does not work:

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
8            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
1000         Ns
1            # of NBs
200          NBs
1            # of process grids (P x Q)
2            Ps
2            Qs
16.0         threshold
3            # of panel fact
0 1 2        PFACTs (0=left, 1=Crout, 2=Right)
2            # of recursive stopping criterium
2 4          NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
3            # of recursive panel fact.
0 1 2        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
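
As a rough sanity check on whether memory could be the limit, the size of the N x N double-precision matrix can be worked out from the values above (N = 1000, P = Q = 2). This is a minimal sketch with hypothetical helper names. It counts only the main matrix and ignores HPL's workspace and the MPI library's own buffers. It also assumes the matrix is spread evenly over the P x Q grid. The 0.8 memory fraction in `max_problem_size` is a commonly quoted rule of thumb, used here as an assumption.

```python
import math

BYTES_PER_DOUBLE = 8


def matrix_bytes(n: int) -> int:
    """Bytes needed to store an n x n matrix of doubles."""
    return n * n * BYTES_PER_DOUBLE


def per_process_bytes(n: int, p: int, q: int) -> float:
    """Approximate share of the matrix held by each process in a p x q grid."""
    return matrix_bytes(n) / (p * q)


def max_problem_size(total_bytes: int, fraction: float = 0.8) -> int:
    """Largest n whose matrix fits in `fraction` of total_bytes (rule of thumb)."""
    return int(math.sqrt(fraction * total_bytes / BYTES_PER_DOUBLE))


if __name__ == "__main__":
    n, p, q = 1000, 2, 2                 # values from the HPL.dat above
    node_mem = 256 * 1024 * 1024         # 256 MB per Pentium 3 node
    print("matrix bytes:     ", matrix_bytes(n))
    print("bytes per process:", per_process_bytes(n, p, q))
    # Assumes the four processes land on four separate 256 MB nodes.
    print("rule-of-thumb N:  ", max_problem_size(p * q * node_mem))
```

Under these assumptions the N = 1000 matrix needs about 8 MB in total, roughly 2 MB per process, which is far below 256 MB per node.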

Any ideas?

Thank you very much,
Leonardo.