I actually sent this a while ago, but I was registered under an old
e-mail address and it bounced. I hope it is still of some value :-)
>> Hello all,
>> I have been testing the performance of an IWILL H8501 8-way opteron
>> system and have been getting some surprisingly bad results. Running the
>> same code on different numbers of nodes produces the following speed-ups:
>> 2 cpus - 2.25 (yes, super-linear)
>> 4 cpus - 3.3
>> 8 cpus - 5.5
Unfortunately this is an artefact of the kernel. Your problem is
very probably that the single-processor number is very bad, because
the kernel cannot decide which CPU the process should run on, so the
process does not stay on one CPU. If available, try to run the same
job on a single-processor machine, or compile the kernel without SMP,
or boot the machine with the maxcpus=1 boot parameter (I am not sure
of the exact name, but it is something like this). We had the same
problem. Another way to improve this number is to run 7 copies of the
yes &> /dev/zero command. Then all the other CPUs are busy and your
real job will be "confined" to the remaining one. We 'fixed' the
problem by installing a newer kernel...
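
To see how much a bad single-processor time can distort the numbers,
here is a small sketch in Python. It only uses the speed-ups quoted
above. The function names and the "inflation" factor are my own
illustration, and the correction assumes (hypothetically) that the
true 2-CPU speed-up is at most 2.0, which may not hold exactly on
your machine.

```python
# Speed-ups reported in the original message: {number of CPUs: speed-up}.
reported = {2: 2.25, 4: 3.3, 8: 5.5}


def efficiency(speedup, ncpus):
    """Parallel efficiency: speed-up divided by the number of CPUs."""
    return speedup / ncpus


def rescale(speedups, inflation):
    """Speed-ups one would get if the single-CPU time were `inflation`
    times shorter (hypothetical correction, not a measurement)."""
    return {n: s / inflation for n, s in speedups.items()}


# An efficiency above 1.0 is the hint that the baseline is too slow.
for n, s in reported.items():
    print(f"{n} cpus: speed-up {s}, efficiency {efficiency(s, n):.3f}")

# Assumption: the real 2-CPU speed-up cannot exceed 2.0, so the
# single-CPU time is inflated by at least 2.25 / 2.0.
min_inflation = reported[2] / 2.0
corrected = rescale(reported, min_inflation)
for n, s in corrected.items():
    print(f"{n} cpus: corrected speed-up at most {s:.2f}")
```

So with a sane single-CPU time the 4- and 8-CPU speed-ups would look
even lower than reported. The super-linear 2-CPU result is the sign
that the baseline, and not only the scaling, needs to be re-measured.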
Milan Hodoscek