LAM/MPI General User's Mailing List Archives

From: Pierre Valiron (valiron_at_[hidden])
Date: 2003-05-21 08:39:27


Hello,

I am developing a big MPI fortran code (for ab-initio quantum chemistry)
using a master/slave paradigm.

The code executes many steps (SCF, transformation steps, CCSD iterations,
etc). Some steps have not been parallelized yet and are purely sequential.
Some steps have been parallelized using MPI-1.

The master task executes all steps.

The slave tasks are waiting for an action to perform (using MPI_Irecv and
looping over MPI_Test + 10 ms sleep). When they receive an action
keyword, they perform it in parallel and return to the idle loop.

I compiled LAM 6.5.9 using gcc version 3.0.4 (Mandrake Linux 8.2
3.0.4-2mdk) and my fortran code uses the latest Intel Fortran 7.1 release
(Build 20030507Z).

I used the following configure options for LAM:
    ./configure --with-fc=ifc --with-fflags="-O -tpp6" \
                --without-romio --with-rpi=XXX \
                --prefix=/usr/local/lam-6.5.9-XXX-ic71

and tested transports XXX = tcp, usysv and sysv.

My platform is a heterogeneous cluster of Linux Athlon PC boxes involving
one 2 GHz biproc and 3 monoprocs ranging from 700 MHz to 1.5 GHz. The
biproc has a gigabit attachment, all other machines have 100 Mbit
ethernet, and all kernels are 2.4.X. The master task 0 runs on the biproc,
tasks 1 to 5 run on the second proc and on the monoprocs.

Execution of parallel steps is a success. I implement some automatic load
balancing for my parallel loops and I achieve an efficiency of about 95%.
The code involves only moderate communications and the results are
insensitive to the rpi selection.

However the idle stages are a disaster.

The idle tasks are eating up to 80-100% cpu in an apparently random
fashion. I was able to verify that the cpu usage is inside LAM because
1) the total number of sleep invocations matches the elapsed time for each
idle stage, and 2) the cpu usage involved in the sleep itself (I am using
the Intel sleepqq routine) is negligible. Typical idling periods range
from one minute to a few days.

I can understand that spinning or polling loops may prove CPU-intensive.
However I am puzzled by the following observations:

a) the CPU usage is apparently stochastic. The idle loop randomly
alternates between burning *lots* of CPU and using a fraction of a percent
over sizeable intervals (according to top -d 1).

b) the slowest monoprocs always remain idle, while the runaway cpu burning
periods only occur on the biproc and the fastest monoproc.

c) the problem remains with rpi=tcp.

d) the CPU burning does not seem to start immediately, but only after a
few seconds.

Could you give me an explanation?

Even better, could you suggest a cure?

Best.
Pierre V.

-- 
       _/_/_/_/    _/       _/       Dr. Pierre VALIRON
      _/     _/   _/      _/   Laboratoire d'Astrophysique (UMR 5571 CNRS)
     _/     _/   _/     _/    Observatoire de Grenoble / U. Joseph Fourier
    _/_/_/_/    _/    _/         BP 53  F-38041 Grenoble Cedex 9 (France)
   _/          _/   _/                                   
  _/          _/  _/        http://www-laog.obs.ujf-grenoble.fr
 _/          _/ _/       mailto:Pierre.Valiron_at_[hidden]
_/          _/_/      Phone / Fax: +33 (0)4 76.51.47.87 / (0)4 76.44.88.21