Hi!
We're two network and system administration students at the University of
Skövde, Sweden. We've built an OSCAR cluster and now we need to
benchmark it. We want to run the HPCC benchmark, which we've compiled with
LAM, using ATLAS compiled on our system. The problem is that it works
fine on three nodes and four CPUs, but as soon as we change hpccinf.txt
to a larger grid (PxQ) to use all 15 compute nodes (16 CPUs, 4x4), the
benchmark won't start. We have also tried 2x4, 1x8, 2x3, 3x3 and 3x4, but
nothing larger than 2x2 or 1x4 works for us.
The cluster consists of two dual AMD 2400+ machines with 2 GB RAM (one
of these is the master node, the other is a compute node), four AMD
1900+ machines with 1 GB RAM, and ten AMD 1900+ machines with 512 MB
RAM. The master node has two NICs; the private one (eth1) is connected
to a Summit4 switch, and all nodes are connected to that switch with
Fast Ethernet. We use OSCAR 4.2 on Fedora Core 3.
This is the hpccinf.txt that we use when we try to run the benchmark on
16 CPUs:
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
6 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
27476 Ns
1 # of NBs
64 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
4 Ps
4 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
8 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
The N value is sqrt(.75*15*536870912/8), rounded down, since we want to
use 15 nodes and the least available RAM on any node is 512 MB
(536870912 bytes, at 8 bytes per matrix element).
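For reference, the calculation of N can be reproduced with a short
Python sketch. The function name and its parameters are only
illustrative and are not part of HPCC; it assumes 8 bytes per
double-precision matrix element and rounds the result down to an
integer.

```python
import math

def hpl_problem_size(nodes, bytes_per_node, fraction=0.75, bytes_per_double=8):
    """Largest N such that an N x N matrix of doubles fits in the given
    fraction of the combined memory (illustrative helper, not from HPCC)."""
    # Number of doubles that fit in the chosen share of total memory.
    total_doubles = fraction * nodes * bytes_per_node / bytes_per_double
    # The matrix is N x N, so N is the integer square root of that count.
    return math.isqrt(int(total_doubles))

# 15 compute nodes, sized by the smallest node (512 MB = 536870912 bytes).
n = hpl_problem_size(nodes=15, bytes_per_node=512 * 1024**2)
print(n)  # 27476
```

This gives the same value as the Ns line in the input file above.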
To run the benchmark we've been using the following command (in the
hpcc-1.0.0 directory, as oscartst):
mpiexec -v -boot -machinefile ../lamhosts n1-15 -np 16 hpcc
For the successful run with four CPUs we used n1-3 -np 4. n1 is a
dual-CPU node and n2-15 are single-CPU nodes.
I want to avoid using the head node (a dual-CPU machine) as a compute
node if possible, which is why I don't use n0.
I've posted this to the oscar-user mailing list, but no one could figure
out why it won't work, and I haven't found a mailing list dedicated to
HPCC benchmarking. If you know of a better list to post this to, please
tell me.
Any help or ideas would be greatly appreciated!
Regards
Olof Mattsson