Thanks Anthony,
That was very useful. To elaborate, I was running the tests on an 8-way
Opteron with 64 GB of memory (8 GB per CPU) and SUSE Professional 9.2, and
each test typically uses only 1-8 GB of memory in total. What you are saying
regarding the non-functioning of NUMA makes a lot of sense, especially
in light of my last test. This test used 8 times more memory than the
first one, but should have been much less communications intensive, yet
I saw significantly reduced performance (-20% efficiency). This of
course utterly confused me, but if, as you suggest, NUMA has effectively
been disabled, then more reads/writes would have been non-local, leading to
the drop. As far as I know, though, the 2.6 SMP kernel is NUMA-aware, so
perhaps node interleaving is enabled in the BIOS. I will report back
once we have our own machines in any case.
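
In the meantime, one rough way to check whether the kernel actually sees
more than one memory node might be to count the nodeN entries under sysfs.
This is only a sketch under assumptions: it assumes sysfs is mounted at /sys
and that the kernel exposes /sys/devices/system/node, and the function name
count_numa_nodes is purely illustrative. A result of 1 (or 0) on an 8-way
Opteron would be consistent with node interleaving being enabled in the
BIOS, or with a kernel built without NUMA support, but it does not prove
either.

```python
import os
import re


def count_numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Count the nodeN directories the kernel exposes under sysfs.

    Returns 0 if the directory does not exist (assumed here to mean the
    kernel is not exposing NUMA topology, or sysfs is not mounted).
    """
    try:
        entries = os.listdir(sysfs_node_dir)
    except OSError:
        return 0
    # Only names of the form node0, node1, ... are memory nodes.
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))


if __name__ == "__main__":
    n = count_numa_nodes()
    if n > 1:
        print(f"kernel sees {n} NUMA nodes")
    else:
        print("kernel sees a single memory node (or no NUMA information)")
```

The directory is passed in as a parameter so that the check can also be
pointed at a copy of the sysfs tree taken from another machine.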
> It would certainly be nice to somehow guarantee that each task ran
> with its memory locally allocated, but this would require some sort of
> user space interface in the NUMA kernel code, as I believe there is in
> IRIX.
I assume this is not the case in the current Linux kernels?
Thanks again,
Eugene