Hi Sriram,
First of all: Thank you very much! All questions answered -
all problems solved :-)
> The high CPU usage in case of usysv is due to the fact that
> synchronization of access to shared memory is implemented using spin
> locks. Alternately, you can use the sysv rpi in which synchronization is
> done using System V semaphores; CPU utilization should be *significantly*
> lesser in this case.
Okay! Sorry - I should have read the LAM User's Guide before posting :-/
I thought usysv was the one without spin locks.
> I ran my version of the same test on a very similar setup (2 nodes, 2 cpus
> each, 1GB memory running Solaris 8), and I see that tcp and usysv have
> comparable execution times.
> [...]
> Considering that you were seeing 50% cpu cycles being spent on iowait, it
> is possible that there are other processes that are utilizing resources
> heavily on those nodes. Try running your example program when the
> machines are lightly loaded; you might see better results.
You are absolutely right! I had an Oracle database running on the machine.
top doesn't show how much memory Oracle uses (no idea why); all you get
is:
  PID USERNAME THR PRI NICE SIZE  RES STATE TIME   CPU COMMAND
[...]
17237 oracle    11   0    0   0K   0K sleep 0:00 0.00% oracle
[...]
So I didn't realize how much memory Oracle was already using.
In fact, the Oracle processes use several hundred MB of main memory.
I shut down Oracle and the execution times were totally different. No CPU
time is spent on iowait any longer (no more swapping).
> mpirun -c 5 -ssi rpi tcp program
Time: 22.4411
> mpirun -c 5 -ssi rpi usysv program
Time: 48.5055
> mpirun -c 5 -ssi rpi sysv program
Time: 13.1972
As you can see, tcp now takes 22 sec instead of 125 sec (with Oracle running
and the machine swapping). sysv is even faster :-) It looks like the most
interesting RPI for my machine...
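
To put numbers on that comparison, here is a small sketch that computes the
ratios between the three timings above. The helper name "ratio" is made up
for illustration and is not part of LAM; the timings are copied from my runs.

```shell
#!/bin/sh
# ratio A B: print A/B rounded to one decimal place.
# "ratio" is a hypothetical helper name, not part of LAM.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Timings in seconds, copied from the three mpirun runs above.
tcp=22.4411
usysv=48.5055
sysv=13.1972

echo "usysv vs. sysv: $(ratio "$usysv" "$sysv")x"
echo "tcp vs. sysv:   $(ratio "$tcp" "$sysv")x"
```

These are single runs on one machine, so the ratios are only a rough guide.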
> * You can turn on debugging info for the RPI SSI modules by using the
>
> -ssi rpi_verbose level:X
>
> flag to mpirun. X can take on any value in the range -1 to 50.
>
> X = -1 (no debugging)
> X = 0 (minimal debugging info)
> X = 50 (maximum debugging info)
Okay - I found out that the default is tcp. Thanks.
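
For anyone else trying this, a sketch of how the verbose flag might be
combined with an explicit RPI choice. It only prints the command line; it
does not run mpirun. The helper name "verbose_cmd" is made up, and I am
assuming that both -ssi options can be given on the same mpirun command line.
"-c 5" and "program" are taken from my runs above.

```shell
#!/bin/sh
# verbose_cmd RPI LEVEL: print an mpirun command line with RPI debugging on.
# Hypothetical helper, not part of LAM. Assumes two -ssi options can be
# passed in one mpirun invocation.
verbose_cmd() {
    rpi=$1
    level=$2
    # The quoted text gives -1 to 50 as the valid range for X.
    case $level in
        -1|[0-9]|[1-4][0-9]|50) ;;
        *) echo "level must be an integer between -1 and 50" >&2
           return 1 ;;
    esac
    echo "mpirun -c 5 -ssi rpi $rpi -ssi rpi_verbose level:$level program"
}

# Example: maximum debugging info with the sysv RPI.
verbose_cmd sysv 50
```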
> Hope this helps.
It helps 100%. Thank you very much!
Charlie