LAM/MPI General User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-04-26 09:23:22


On Mar 30, 2006, at 3:00 PM, Olof Mattsson wrote:

> We're two network and system administration students at the University
> of Skövde, Sweden. We've built an OSCAR cluster and now need to
> benchmark it. We want to run the HPCC benchmark, which we've compiled
> with LAM against an ATLAS built on our system. It works fine on three
> nodes and four CPUs, but as soon as we change hpccinf.txt to a larger
> grid (PxQ) to use all 15 computing nodes (16 CPUs, 4x4), the benchmark
> won't start. We have also tried 2x4, 1x8, 2x3, 3x3 and 3x4, but
> nothing larger than 2x2 or 1x4 works for us.
> The cluster consists of two dual AMD 2400+ machines with 2 GB RAM (one
> is the master node, the other a computing node), four AMD 1900+ with
> 1 GB RAM, and ten AMD 1900+ with 512 MB RAM. The master node has two
> NICs; the private one (eth1) is connected to a Summit4, and all nodes
> are connected to that switch with Fast Ethernet. We use OSCAR 4.2 on
> Fedora Core 3.

Sorry about the delay in replying; somehow this message slipped
through the cracks over the last month :(.

How is the benchmark failing? Is it crashing, or just appearing to
take a long time? Some parts of the HPCC suite do not scale well,
especially over TCP, so they can take a long time to run as the node
count increases, especially with sub-optimal tuning parameters. If it
is crashing, a backtrace would be most helpful. If it is hanging, a
backtrace from where it hangs would still be useful.
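
One more thing that may be worth ruling out first: as far as I know, the HPL part of HPCC needs at least P x Q MPI processes, so a 4x4 grid cannot start if the job was launched with only four. The Python sketch below is only an illustration of that check, not part of HPCC or LAM. The function name grid_fits and the assumption of one MPI process per CPU are mine.

```python
def grid_fits(p, q, processes):
    """Return True when a P x Q process grid needs no more ranks than were launched."""
    return p * q <= processes


# Grids mentioned in the original report.
grids = [(2, 2), (1, 4), (2, 3), (2, 4), (1, 8), (3, 3), (3, 4), (4, 4)]

# Assumption: one MPI process per CPU, so 16 on the full cluster.
launched = 16

for p, q in grids:
    status = "ok" if grid_fits(p, q, launched) else "too few processes"
    print(f"{p}x{q} needs {p * q} processes: {status}")
```

Setting launched to 4 in this sketch marks everything except 2x2 and 1x4 as too small, which would match the pattern you describe. That is only a guess at the cause, though.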

Brian

-- 
   Brian Barrett
   LAM/MPI developer and all around nice guy
   Have a LAM/MPI day: http://www.lam-mpi.org/