LAM/MPI General User's Mailing List Archives

From: Bogdan Costescu (bogdan.costescu_at_[hidden])
Date: 2004-08-11 10:14:23


On Wed, 11 Aug 2004, C.L. Lai [ALAN] wrote:

> Why doesn't SGE execute lamd on the remote nodes?

Because not enough slots were allocated. (I said this several times
in my previous message.)

> You lost the bet.
> the number of slots equals the number of processors on each node as it
> seems from SGE.

I don't know if you are talking about the maximum number of slots that
can be allocated by SGE on the node or about the number of slots
allocated for the job. Even if you have a maximum of 4 slots for a
node, SGE might decide to allocate only one from this node because of
its allocation policies (based on load, for example).
To find out if this is the case, in the batch script before running
lamboot add a line like:

cat $PE_HOSTFILE

Then look in the job's .o file for the output. The second column gives
the number of slots allocated to the job on each node.
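
If the raw file is hard to read, something along these lines could go in
the batch script instead. This is only a sketch: summarize_pe_hostfile is
a name I made up, and it assumes that SGE exports the path of the hostfile
to the job as PE_HOSTFILE and that each line has the host name in the
first column and the slot count in the second.

```shell
# Hypothetical helper (not part of SGE or LAM): print the slots granted
# on each host, plus the total, from an SGE PE hostfile.
summarize_pe_hostfile() {
    # Assumed line format: <host> <slots> <queue> <processor range>
    awk '{ printf "%s: %d slot(s)\n", $1, $2; total += $2 }
         END { printf "total: %d slot(s)\n", total }' "$1"
}

# In the batch script, before lamboot. The guard skips the summary when
# the job was not started under a parallel environment.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

If the total is smaller than the number of processes you pass to mpirun,
or a node you expected does not appear, then the allocation is the
problem rather than lamboot.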

-- 
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]