We are implementing sparse matrix (n x n) times vector (n x 1)
multiplication in parallel, where the matrix is divided column-wise (an
equal number of columns per processor) and the vector is divided
accordingly. Each processor performs its local matrix-vector
multiplication, which produces a vector of size (n x 1) on each
processor. To get the final result vector, all the local vectors are
summed.
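
To make the scheme concrete, here is a minimal serial sketch of it in Python. It is not our actual Fortran/MPI code: the function names, the (row, col, value) storage of the nonzeros, and the handling of leftover columns when n is not divisible by the number of processors are all assumptions made for illustration. The element-wise sum stands in for the MPI_Allreduce step.

```python
def split_columns(n, p):
    """Return p contiguous (start, stop) column ranges covering 0..n.

    Assumption: leftover columns go to the first ranks, one each.
    """
    base, extra = divmod(n, p)
    ranges = []
    start = 0
    for rank in range(p):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def local_matvec(entries, x, n, col_range):
    """Multiply the columns in col_range by the matching slice of x.

    entries is a list of (row, col, value) nonzeros (hypothetical format).
    The result is a full-length (n) partial vector, as on each processor.
    """
    lo, hi = col_range
    y = [0.0] * n
    for i, j, v in entries:
        if lo <= j < hi:
            y[i] += v * x[j]
    return y


def sum_partials(partials):
    """Element-wise sum of the partial vectors (stand-in for the reduction)."""
    return [sum(vals) for vals in zip(*partials)]


def parallel_matvec(entries, x, n, p):
    """Simulate p processors one after another and sum their results."""
    partials = [local_matvec(entries, x, n, r) for r in split_columns(n, p)]
    return sum_partials(partials)
```

Note that in this layout every processor contributes a vector of full length n to the sum, regardless of how many columns it owns.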
We use MPI_Allreduce to collect the final result.
The matrix we are processing is 39601 x 39601, so the vector summed
by MPI_Allreduce has 39601 entries.
We ran this code with different numbers of processors and got very
strange results.
The time required generally increases from 1 to 19 processors and then
drops suddenly at 20 processors. After that it remains roughly constant.
Processors   Time for MV (s)
 1           0.145209
 2           0.123415
 3           0.142032
 4           0.153569
 5           0.154709
 6           0.167946
 7           0.177953
 8           0.195782
 9           0.190688
10           0.203196
12           0.224951
14           0.252618
16           0.262348
17           0.271355
18           0.286621
19           0.298761
20           0.105971
22           0.102517
24           0.105836
26           0.102807
28           0.104299
30           0.105835
32           0.10602
We measure the time with the Fortran cpu_time intrinsic.
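
One thing that may be relevant here: cpu_time reports processor time, which is not necessarily the same as elapsed wall-clock time. The sketch below illustrates the difference using Python's time.process_time and time.perf_counter as stand-ins. This is an assumption for illustration only; our code is Fortran and the helper name measure is hypothetical.

```python
import time


def measure(work):
    """Return (cpu_seconds, wall_seconds) spent while running work()."""
    cpu_start = time.process_time()    # processor time of this process
    wall_start = time.perf_counter()   # elapsed wall-clock time
    work()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used, wall_used


if __name__ == "__main__":
    # Sleeping consumes wall-clock time but almost no processor time.
    cpu, wall = measure(lambda: time.sleep(0.2))
    print("cpu:", cpu, "wall:", wall)
```

A process that spends part of its time idle shows a processor time smaller than its wall-clock time, so the two clocks can tell different stories for the same run.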