On Thu, 20 May 2004, Malar Chinnusamy wrote:
> In my program which I run with 12 processes, with 1 process/node, all even
> processes send data (1.6*10^9 bytes) to odd processes.
So, reduced to 4 processes, is the communication pattern:
0->1 and 2->3
or
0->(1 and 3) and 2->(1 and 3) ?
> All the processes take ~136 sec to finish. This indicates that there
> is no congestion right...
1.6 GB / 136 s ~ 11.8 MB/s, which is about the maximum that a Fast
Ethernet link can deliver (100 Mbit/s is 12.5 MB/s before protocol
overhead). So you are in fact bandwidth limited.
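To make the arithmetic explicit, here is a small Python sketch. It
assumes 1 MB = 10^6 bytes and compares against the raw 100 Mbit/s
signalling rate of Fast Ethernet; the variable and function names are
only illustrative.

```python
# Figures taken from the thread: 1.6*10^9 bytes moved in about 136 s.
BYTES_SENT = 1.6e9
ELAPSED_S = 136.0

# Fast Ethernet signals at 100 Mbit/s, i.e. 12.5 MB/s before protocol overhead.
FAST_ETHERNET_BYTES_PER_S = 100e6 / 8


def throughput_mb_per_s(nbytes, seconds):
    """Return throughput in MB/s (1 MB = 10^6 bytes)."""
    return nbytes / seconds / 1e6


observed = throughput_mb_per_s(BYTES_SENT, ELAPSED_S)
fraction = observed * 1e6 / FAST_ETHERNET_BYTES_PER_S
print(f"observed: {observed:.2f} MB/s, {fraction:.0%} of raw line rate")
```

Anything above roughly 90% of the raw rate leaves little room for
improvement on this kind of link.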
> rank 0 time taken for receiving from process 2 is 136.275022 for 200000000
> rank 2 time taken is 136.266951 for 200000000
Is the second line the time measured on the sender side?
> rank 0 time taken for receiving from process 1 is 516.601192 for 200000000
> rank 1 time taken is 652.565886 for 200000000
If the answer to the above question is 'yes', I would suggest first
finding out why 2->0 shows almost equal times on the sender and
receiver sides, while 1->0 shows a significant difference between
them.
Another question is related to the amount of data: how much memory do
you have on these nodes? Are you sure that the data set kept in any
node's memory (including rank 0) never exceeds the physical RAM?
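A quick way to check this on a node is sketched below in Python. It
assumes a Linux-like system where os.sysconf exposes SC_PAGE_SIZE and
SC_PHYS_PAGES; the number of buffers held at once (1 or 2) is a
hypothetical value, since I do not know how your program allocates
them.

```python
import os


def physical_ram_bytes():
    """Physical RAM of this node, via POSIX sysconf (Linux and similar)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_ram(n_buffers, bytes_per_buffer, ram_bytes):
    """True if n_buffers of the given size fit in ram_bytes.

    Ignores the OS, the MPI library's own buffers and the rest of the
    program, so a True result is necessary but not sufficient.
    """
    return n_buffers * bytes_per_buffer <= ram_bytes


BYTES_PER_MESSAGE = 1.6e9  # message size quoted in the thread

ram = physical_ram_bytes()
for n in (1, 2):  # hypothetical numbers of buffers held at once
    ok = fits_in_ram(n, BYTES_PER_MESSAGE, ram)
    print(f"{n} x 1.6e9 bytes in {ram / 1e9:.1f} GB RAM: "
          f"{'fits' if ok else 'does NOT fit'}")
```

If a node reports "does NOT fit" for the number of buffers it really
holds, swapping alone could explain the long times.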
> Can you please tell explain why this happens. Its only for large messages.
Is there some cut-off point? Or does the difference just keep growing
as the message size increases? If so, by how much?
--
Bogdan Costescu
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu_at_[hidden]